ABSTRACT

A cellular array is an iterative array of identical information processing 
machines, cells. The arrays discussed are rectangular arrays of programmable
logic, in which information stored in a working cell tells the cell how to behave. No 
signal line connects more than a few cells. A loading mechanism in each cell allows 
a computer directly connected to one cell to load any good cell that is not walled 
off by flawed cells. A loading arm is grown by programming cells to form a path 
that carries loading information. Cell mechanisms allow a computer to monitor the 
growth of a loading arm, and to change the arm’s route to avoid faulty cells. 
Properly programmed cells carry test signals between a tested cell and a testing 
computer directly connected to only a few cells. The computer may discover the 
faulty cells in an array, and repair the array by loading the array's good cells to
embed a desired machine. 

Terminology and network models are developed to describe the
characteristics of a machine that are important to the test and repair of an array 
embedding that machine. Important machine classes are defined, and their test and 
repair requirements are compared. Computer simulations of repair aid this 
comparison. 

Each machine class is represented by a particular cellular machine 
design. Arrays are presented for realizing highly-integrated, computer-maintained 
memories, such as variable-length shift-registers, random-access memories, and
track-addressed sequential-access memories. One flawed array of simple cells 
may perform like any digital machine, within limits set by the size of the array, its
number of input-output leads, and the speed of its components. One such machine 
can test, configure, and repair its cellular environment. Applications for these 
cellular arrays are discussed. 

The thesis’ approach is oriented toward the realities and trends in
large-scale integrated circuit production, and has potential integration level,
reliability, maintainability, and flexibility advantages.
CHAPTER 1: OVERVIEW


Section 1.0: Introduction 

A cellular array is an iterative array of identical information processing 
machines, cells. Test of an array discovers its flawed cells. Configuration of an 
array programs it to behave like some machine. Repair of an array programs it to 
behave like a desired machine in spite of faulty array cells. This thesis develops a 
practical systems approach to highly integrated, computer-maintained cellular 
machines. The structural simplicity of cellular machines gives them many 
advantages, especially now when large-scale integrated circuits (LSI) are 
proliferating. We specify cell mechanisms and outline associated support programs
for an arbitrarily large, two-dimensional, rectangular array. While we focus on
two-dimensional rectangular arrays, our approach has obvious extensions to arrays
with different interconnection geometries and more dimensions. This approach
allows a digital machine to electronically test, configure, and repair an array by
direct communication with only a few cells in the array. The fact that a computer
can test and repair an array implies that the array need not be perfect. All the 
cells of the array may be simultaneously produced as a very large, integrated 
array device. Such a device usually has faulty cells. After the array is first 
fabricated, a computer can find the defective cells in the array and load a perfect 
machine, which incorporates only good cells, into the flawed array. Thus the same
mass-produced device may be program-customized by a mass-produced device,
the computer, to behave like a desired machine. If this array-embedded machine
develops a new flaw during its operation, and if this flaw causes a noticed
performance degradation, the array may be partly or completely re-tested and
repaired. Thus the array may be maintained by a digital machine; furthermore, the
array may be re-customized at any time. This approach is tailored to the realities
and trends in design, manufacture, distribution, and maintenance of digital systems,
particularly those [...]

We discuss arrays of “programmable logic”, where information loaded
into memory elements in a working cell tells the cell how to behave. No signal line
connects more than a few cells. Each cell contains a loading mechanism;
this allows a computer directly connected to only one cell to load any good cell
that is not walled off by flawed cells. The loading information which the computer
sends to the array may select one of a large set of feasible paths for a loading
arm that carries loading information to a cell. We develop cell mechanisms that
allow the computer to monitor the growth of a loading arm to a cell, and to change
the route of the arm to avoid faulty cells. A method is described for testing cells
in an array by using a test machine directly connected to only a few cells in the
array. Properly programmed cells carry test signals between some newly tested
cell and the test machine. A loading arm may be used to vary the state of the
tested cell.

Programming an array to behave like a given machine is called
embedding that machine. When a machine is embedded in an array, it should not
allow faulty cells to affect its behavior. Therefore an embedded machine is
programmed to ignore signals sent from faulty cells. We find that the
communication paths required between the essential cells of an embedded machine
affect test and repair of an array for embedding that machine. Development of
terminology and network models allows us to describe embedded machines more
precisely. Important embedded machine classes are defined, and their associated
test and repair requirements are detailed. Computer simulations of repair
facilitate this comparison.

For each class of machine that’s described, a particular, potentially
useful representative of that class is detailed. All arrays’ cells contain our loading
mechanism. Arrays are presented for realizing highly integrated, computer-
maintained memories. These include arrays for realizing variable-length shift-
registers, random-access memories, and track-addressed sequential-access
memories. One array of simple cells may be programmed to embed an arbitrary
digital machine, within limits set by the size of the array, its number of input-
output leads, and the speed of its components. An array-embedded computer can
test, configure, and repair its cellular environment using techniques we develop.
Indeed, two or more array-embedded computers can test and maintain each other.

Section 1.1

Our approach requires introduction of [...]
machines, cells, interconnected in an iterative way. Each cell of an n-dimensional
array communicates directly with other machines through a finite set of signal lines.
Figure 1.1 shows a possible layout of a cellular array. Each cell in a given array has
a fixed number of signal line side-sets, each corresponding to potential direct
communication with another cell, a neighbor. If any member of a side-set connects to
a neighbor, all members of the side-set connect to that neighbor. If a side-set doesn’t
connect to a neighbor, some or all of its members may connect to an extra-array
machine. Unconnected inputs act as if they are connected to a binary 0; this is easily
implemented. We concentrate on checkerboard cellular arrays, two-dimensional
arrays like that shown in figure 1.1 [...], with input
and output signal lines at each side-set. We use the term “checkerboard [...]”
for an identical cell. Some have proposed arrays in which signal busses run
[...] bus is damaged, we require that there be no signal busses in checkerboard arrays;
at most, a signal line connects a cell to its four neighbors. Checkerboard arrays
are well-suited to the step-and-repeat nature of current integrated circuit (IC)

Fig. 1.1 Layout Of A Checkerboard Array Connected To Two Extra-array Machines

Key: An arrow indicates one or more of a machine’s signal lines [...] direction [...].
[...] represent extra-array machines. The small, unlettered boxes represent cells.

Fig. 1.2 Interconnection Network For Checkerboard Array Of Figure 1.1

production. These arrays also offer testing, configuration, and repair advantages.

An interconnection network, such as that of figure 1.2, partially describes
an array’s layout by showing how each cell directly communicates with its cellular
neighbors or extra-array machines. In an interconnection network, each node
represents a cell, and each diamond represents an extra-array machine. A node is
linked to another node or diamond if and only if one or more signal lines directly
link the associated machines. A node has degree n if n links connect to the node.

It’s obvious that a subarray of an array is also an array. Consequently,
it’s valid to isolate the activity in a sub-array of an array and treat the cells
outside this array as extra-array machines.
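
As a rough illustration, the interconnection network described above can be modeled as an undirected graph whose nodes are cells and extra-array machines. The Python sketch below is not taken from the thesis: the function names, the tuple encoding of nodes, and the choice of which cells touch the two extra-array machines "A" and "B" are all assumptions made for the example.

```python
from collections import defaultdict

def checkerboard_network(rows, cols, extra_links):
    """Build an interconnection network for a rows x cols checkerboard array.

    Nodes ("cell", r, c) stand for cells; ("extra", name) stands for an
    extra-array machine (a diamond). extra_links is a hypothetical mapping
    from a machine name to the cells it is directly connected to.
    """
    links = defaultdict(set)

    def link(a, b):
        # One link per pair, no matter how many signal lines join the pair.
        links[a].add(b)
        links[b].add(a)

    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                link(("cell", r, c), ("cell", r + 1, c))
            if c + 1 < cols:
                link(("cell", r, c), ("cell", r, c + 1))
    for name, cells in extra_links.items():
        for r, c in cells:
            link(("extra", name), ("cell", r, c))
    return links

def degree(links, node):
    """A node has degree n if n links connect to it."""
    return len(links[node])

# Assumed layout: machine "A" touches one corner cell, "B" the opposite corner.
net = checkerboard_network(3, 3, {"A": [(0, 0)], "B": [(2, 2)]})
```

Under these assumptions an interior cell has degree 4, while a corner cell attached to an extra-array machine has degree 3 (two cellular neighbors plus the machine).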

We focus on programmable logic checkerboard arrays, where each cell
contains function-specification state bits selecting which of several operations the cell
performs. Both cells and arrays are viewed as having two functional layers - a
loading layer and a processing layer - with distinct inputs, outputs, and memory
elements for each layer. Of course, these layers may be physically intertwined.
At any given time instant, only one of a cell’s layers is activated. The processing
layer of an array is used to provide the functions of the array that are immediately
useful to an array’s user. The processing layer’s output and state are a function
of the processing layer’s input and state, and the function state - the state of the
function-specification state bits. The function-specification state bits may enter a
particular function state when an array is powered on; after this, they may only
be loaded through use of the array’s loading inputs. The sole function of the
loading layer is to load these bits, and thereby affect the function performed in the
processing layer of the array. Thus the function-specification bits act as
intermediaries between the loading layer and the processing layer. The function-
specification bits are the only cell logic elements that are not in either layer.
Typical use of an array involves loading the function-specification state bits; the
loader is then quiescent while the processing layer performs its function. Re-use of
the loader may re-program the processing layer to provide some new function.

A user expects certain variables of
an array to remain fixed during a given time interval. For instance, an application
might dictate that an array of programmable logic [...] the same
processing input-output leads and function state. The user of this array would
justifiably think of his array as an environment for a machine embedded in the
processing layer, with the fixed attributes of the array specifying the embedded
machine. Similarly, a user might only use an array’s loading inputs during an
interval devoted to loading function-specification state bits. The user would then
think of the array as an environment for a machine embedded in the loading layer
during this interval. Finally, a user might interleave processes of loading, using
loading inputs, and testing, using processing inputs and outputs, during an interval.
The user could think of an embedded machine as occupying both loading and
processing layers during the interval. As we might expect, the nature of the
embedded machine profoundly affects the testability and repairability of an array.

For many actual array [...]

Knowledge of the constraints on an array during an interval affects
testing and repair so profoundly that we develop a language to describe these
constraints. We consider definitions relative to use of an array during a given time
interval. The user identifies which side-set lines and memory elements directly
interest him during an interval; these are the interesting elements. The input-
output leads of an array that are connected to a user’s machine might be the
interesting lines for that user. Similarly, function-specification bits which would
affect the function an array performed for a user might be interesting, while
function-specification bits in a remote section of an array might be uninteresting.
Thus interest is defined in terms of a user’s intended application. The state of the
array at the beginning of the interval, and the signals it may receive during the
interval, affect which of its memory elements and side-set lines are relevant; that
is, which may affect interesting elements during the interval. An embedded machine
for a given array is described by a list of those relevant memory elements and
side-set input lines whose values are known to be fixed during the interval, and
their associated values. The embedded machine’s input-output lines are the
relevant input-output lines of the array that are variable during the interval. A
programmable-logic loading mechanism may set function-specification state bits, and
thereby partially or completely specify the machine embedded in the processing
layer of an array. Figure 1.3 gives a characterization of one such embedded
machine.

[...] performance of the embedded machine, [...]

Fig. 1.4 Relcon Network For The Embedded Machine Of Figure 1.3

Different embedded machines may be equivalent. If a flawed array is
configured to embed a machine that is equivalent to a perfect array’s embedded
machine, we say the flawed array has been repaired to embed a perfect machine.
Our array repair is therefore an information process setting the states of cells, and
not a mechanical alteration of an array.

A relevant connection network, or relcon network, is a subnetwork of the
interconnection network that describes communication in an embedded machine
(see figure 1.4). As in the interconnection network, dots correspond to cells and
diamonds correspond to extra-array machines. If and only if at least one relevant
connection directly links a cell with another cell or extra-array machine, a link
connects the cell’s dot to the appropriate dot or diamond in the relcon network. A
cell’s relcon neighbors are the entities - cells or extra-array machines - whose
representatives are directly linked to the cell’s dot in the relcon network. A cell
with n relcon neighbors is a cell of relcon degree n, called a relcon-n cell. An
embedded machine’s relcon is the highest relcon degree of any of its cells.

A qualification of our definition of an embedded machine makes it more
consistent with our intuitive understanding of a machine. We require that two
cells in the same embedded machine be connected by some path of relcon
neighbors; that is, an embedded machine’s relcon network must have some path
between their representative dots. Thus two or more machines may be embedded
in the same array.
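
The relcon definitions above lend themselves to a small worked example. The Python sketch below computes relcon degrees and splits a relcon network into separate embedded machines by path-connectivity. The encoding (cells as coordinate tuples, extra-array machines as strings) and all function names are assumptions made for illustration. The sketch also assumes that the connecting paths run through cells only, which the text does not state explicitly.

```python
def build_relcon(links):
    """Turn a list of (node, node) relevant connections into a neighbor map."""
    relcon = {}
    for a, b in links:
        relcon.setdefault(a, set()).add(b)
        relcon.setdefault(b, set()).add(a)
    return relcon

def is_cell(node):
    # Assumed encoding: cells are (row, col) tuples, extra-array machines are strings.
    return isinstance(node, tuple)

def relcon_degree(relcon, cell):
    """A cell with n relcon neighbors is a relcon-n cell."""
    return len(relcon[cell])

def machine_relcon(relcon, cells):
    """An embedded machine's relcon is the highest relcon degree of its cells."""
    return max(relcon_degree(relcon, cell) for cell in cells)

def embedded_machines(relcon):
    """Group cells that are joined by some path of cell-to-cell relcon links."""
    seen, machines = set(), []
    for start in relcon:
        if not is_cell(start) or start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(n for n in relcon[node] if is_cell(n) and n not in group)
        seen |= group
        machines.append(group)
    return machines

# Two hypothetical arms rooted at the same extra-array machine "X".
relcon = build_relcon([
    ("X", (0, 0)), ((0, 0), (0, 1)), ((0, 1), (0, 2)),
    ("X", (2, 0)), ((2, 0), (2, 1)),
])
machines = embedded_machines(relcon)
```

Here the two chains share no cell-to-cell path, so they count as two machines embedded in the same array, each with relcon 2.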


An array constrains the relcon network of machines embedded in that
array. Two cells can be relcon neighbors only if they are linked in the array’s
interconnection network. Furthermore, the set of flawed cells [...] may [...]

Section 1.2: The Loader And Related Concepts

Transmission states are fundamental to many of our testing, configuration,
and repair operations. In a transmission state, some inputs to a cell are
obtained as outputs after a delay [...]

The first subscript of an input $I$ or output $O$ - U, R, D, or L - denotes one of
a cell’s four side-sets - Up, Right, Down, or Left. Each side-set of a given cell has
the same number of loading inputs and outputs, M, and the same number of
processing inputs and outputs. For every index $k$, the lines $I_{Uk}$, $I_{Rk}$,
$I_{Dk}$, $I_{Lk}$, $O_{Uk}$, $O_{Rk}$, $O_{Dk}$, and $O_{Lk}$ are associated,
and given the same name. Thus we might speak of the Select loading input and
output of each of a cell’s side-sets. In a loading transmission state, each loading line
of one side-set is transmitted to an associated loading output at one other side-
set after a delay not longer than about one gate-delay. Thus a loading
transmission state busses the loading inputs of one side-set to associated loading
outputs at another side-set. Processing transmission states effectively connect a bus
to every side-set’s processing outputs. Each bus connects the processing outputs
of a side-set to the associated processing inputs of any one of the cell’s side-sets.

Transmission links are important to our testing and repair processes in
some arrays. A transmission link is a processing layer’s chain of cells in
transmission states that acts as a two-way signal bus, carrying a processing
input at one of its ends to an associated processing output at its opposite end. A
transmission link performs the same bussing function in an array independent of the
link’s path or length.

Our loading mechanism uses a loading transmission state to transmit
loading signals to the inputs of a cell being loaded. In testing arrays for embedding
some relcon-3 and relcon-4 machines, transmission links conduct test signals from a
test machine, such as a computer, to a tested cell. These same links concurrently
return a tested cell’s response back to the test machine. Arrays tested in this
way are repaired by linking clusters of good cells via transmission links.

An embedded machine arm is a chain of relcon neighbors (see figure
1.5). The arm’s tip has relcon-1, and all its other cells have relcon-2. The
relcon-2 cells are the arm’s body. The base of the arm is the cell farthest from the
tip in the relcon network’s chain. A loading arm is used to load the cells in an
array; loading signals flow from the loading arm’s base to its tip, where a cell is
loaded. [...] layer of a machine.

We develop a loading mechanism that may be coupled with any
programmable logic processing mechanism in an array of two or more dimensions.
This loading mechanism allows the loading of a cell in a perfect, arbitrarily large
array by signals input to one cell anywhere in the array. This is possible because
a loading arm may be grown to the loaded cell. The loading arm is an arm

Fig. 1.5 Relcon Network For Two Embedded Arms

Fig. 1.6 Relation Between Essential Network And Associated Relcon Networks

A) Essential network for figure 1.3’s embedded machine

B) Relcon network for figure 1.3’s embedded machine

C) [...]

D) Relcon network for the embedded machine in C



embedded in the loading layer of an array. Cells in the body of the arm are in
loading transmission states, carrying loading signals from the base of an arm to its
tip. [...] side-set are sufficient to effect any desired loading behavior of a cell in a
perfect array, and for most cells in a flawed array. When a loading side-set is
activated, a cell prepares to accept loading information. This information is then
clocked into the cell through the active loading lines, setting the cell’s loader and
function states. If these loading inputs remain active, the loader state determines how
the cell’s loader subsequently behaves. The cell may enter a loading transmission
state, in which it transmits its loading inputs to loading outputs at one of its side-
sets. That is, it may become part of the body of a longer arm leading to some new
tip cell. Thus a cell may load one of its neighbors, which loads one of its
neighbors, ... [...] cell’s loader inputs to other cells in an array. A tip cell may be
loaded with a loader state preparing it to be re-loaded; this is useful when a cell is
tested in various function states. A loaded tip cell can cause its relcon neighbor in the
loading arm to be the new tip; the former tip’s loading layer is de-activated, and the
loading arm is incrementally retracted. A signal to the base of an arm can
also cause the arm to totally retract. That is, the signal can de-activate all the
[...] it is used to grow a path that carries loading information from one
[...] extend the arm in one of several directions, retract the arm, or repeatedly
change the state of the arm’s tip. Because only a loading arm’s tip cell can have
its function state changed, a cell’s temporary role as an arm’s tip is sufficient to
permanently set its function state. A loading arm may also re-load cells in an
array, in order to re-use [...]

In a flawed or irregularly
shaped array, the arm’s ability to snake through alternate paths, twisting round
flaws and retracting when necessary, gives it advantages. Chapter 2 details the
loader, and suggests how loader options extend the loading arm’s [...]

Use of a balanced loader and balanced function states facilitates
[...]
active property of the loader implies that a working cell can be activated, loaded,
and de-activated by a loading arm linked to the loader inputs of one of the cell’s
[...] used to activate a cell for loading, load the cell with a desired function state,
subsequently send loading signals through the cell to [...] side-set’s loader
outputs, and de-activate the cell’s loader. [...]
arm to funnel the same loading command to an arm-tip [...] the
arm’s path through an array.
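
The arm's ability to snake around flaws and retract when necessary resembles a depth-first search with backtracking. The Python sketch below is a simulation under simplifying assumptions, not the thesis's loader: the set `flawed` stands in for the tests a computer would run after each extension, the base cell is assumed to be working, and the order in which directions are tried (Up, Right, Down, Left) is arbitrary.

```python
def grow_arm(rows, cols, flawed, base, target):
    """Grow an arm from base toward target, retracting at dead ends.

    Returns (arm, retractions). arm is the list of cells from base to tip,
    or None if the target is walled off by flawed cells.
    """
    arm = [base]
    tried = {base}          # cells the arm has already attempted to enter
    retractions = 0
    while arm:
        tip = arm[-1]
        if tip == target:
            return arm, retractions
        r, c = tip
        for nxt in ((r - 1, c), (r, c + 1), (r + 1, c), (r, c - 1)):
            in_array = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_array and nxt not in tried:
                tried.add(nxt)
                if nxt not in flawed:   # the extension tested good
                    arm.append(nxt)     # nxt becomes the new tip
                    break
        else:
            arm.pop()                   # no usable neighbor: retract one cell
            retractions += 1
    return None, retractions

# The arm must detour around two flawed cells to reach the far corner.
detour, _ = grow_arm(3, 3, {(0, 1), (1, 1)}, (0, 0), (0, 2))
# The arm enters a dead end first, retracts three cells, then succeeds.
short, backups = grow_arm(2, 3, {(1, 1)}, (0, 0), (1, 0))
# The base is walled off, so no arm can reach the target.
blocked, _ = grow_arm(3, 3, {(0, 1), (1, 0)}, (0, 0), (2, 2))
```

A real loader would discover flaws by observing signals returned through the arm rather than by consulting a known flaw set; the set is used here only to keep the example self-contained.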

The term “balance” is used to indicate a cell mechanism’s functional
symmetry with respect to its side-sets. A cell’s processing mechanism may be
balanced for some or all of the cell’s function states. Consider some function state
S of the processing mechanism. This state can be [...]
of statements relating each subscripted processing output
and each processing state (if applicable; that is, if the processing state can affect
some processing output in that function state) to subscripted processing inputs and
[...]. If every permutation of the side-set subscripts in this
set of statements - such as the permutation interchanging L and U, but keeping R
and D where they are - yields a set of statements that completely describes some
allowed function state, the processing mechanism is balanced for state S and the
balance-related function states generated by the side-set permutations. If a cell is
balanced in every allowed function state, the cell is balanced. Sometimes the
construction of a cell requires disallowed function states. One might, for instance,
use four function-specification state bits for thirteen allowed function states.
Three function states might be incidentally generated, useless, and therefore
disallowed.

An example clarifies the concept of a processing mechanism’s balance.
Consider some processing mechanism with one processing input and one processing
output at each of its four side-sets. One transmission function state $F_A$ of this
mechanism is described by the set of statements below.

For $F_A$: $\{I_D \to O_U,\ I_U \to O_D,\ I_L \to O_R,\ I_R \to O_L\}$

An arrow indicates an input is transmitted to an output. This processing
mechanism is balanced in state $F_A$ if and only if the cell has function states $F_B$ and
$F_C$ such that the following statements are true.

For $F_B$: $\{I_D \to O_R,\ I_R \to O_D,\ I_L \to O_U,\ I_U \to O_L\}$

For $F_C$: $\{I_D \to O_L,\ I_L \to O_D,\ I_U \to O_R,\ I_R \to O_U\}$

If the mechanism is balanced in state $F_A$, it is obviously balanced in states $F_B$ and
$F_C$. We then say that $F_A$, $F_B$, and $F_C$ are balance-related. If the mechanism is
balanced for every allowed function state, the cell is balanced.
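
The balance condition can be checked mechanically by applying every permutation of the four side-sets to a function state. The Python sketch below assumes a simplified encoding in which a transmission state is a set of (input side, output side) pairs, with one processing input and one output per side-set. The state `F_A` used here is assumed to be a straight-through state (each input is transmitted to the opposite side-set); the function names are hypothetical.

```python
from itertools import permutations

SIDES = ("U", "R", "D", "L")

def permute(state, mapping):
    """Relabel the side-sets of every (input side, output side) statement."""
    return frozenset((mapping[i], mapping[o]) for i, o in state)

def balance_related(state):
    """All function states generated from state by side-set permutations."""
    return {permute(state, dict(zip(SIDES, p))) for p in permutations(SIDES)}

def is_balanced_for(state, allowed):
    """Balanced for state iff every permuted version is an allowed state."""
    return balance_related(state) <= set(allowed)

# Assumed straight-through state: each input goes to the opposite side-set.
F_A = frozenset({("D", "U"), ("U", "D"), ("L", "R"), ("R", "L")})
related = balance_related(F_A)
```

With this encoding the 24 side-set permutations generate only three distinct states from `F_A`, one for each way of pairing up the four side-sets, which is consistent with an example that names exactly three balance-related states.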

[...] an approach that specifies mechanisms that allow
test, configuration, and repair of an arbitrarily large flawed array via leads
connected to a few cells in the array. This approach requires assumptions about
faulty behavior. One basic assumption is that a good cell is loaded under a test
machine’s control, and not by signals caused by faulty cells. In any approach that
allows appropriate signals into a finite set of cells to affect loading of a cell, there
is some chance that faulty cells will provide those signals to load a cell without a
test machine’s control, and therefore contradict this assumption. We describe
design techniques for making this arbitrarily unlikely; this involves making the set
of valid loading commands smaller than the set of possible loading commands, so
that fault-generated commands are likely to be disobeyed. Another basic
assumption is that the behavior of a cell depends on that cell’s state and inputs,
and not on the state of other cells. Our checkerboard arrays help assure the
validity of this assumption, because no signal line connects distant cells. A third
basic assumption is that a faulty cell is somewhat consistent in its faulty behavior:
if a cell is good whenever it or its neighbors are tested, the cell must be good in
the intervals between tests. If a good cell becomes flawed, it may not pretend to
be good whenever it is tested. The first and third assumptions are met if a faulty
cell’s outputs all remain stuck at some value. Another assumption, which is made
to reduce test time, states independence of certain mechanisms in a cell. For
instance, it’s assumed that the state of a shift-register stage does not affect the
[...]

Ultimate justification of these assumptions [...]
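
The effect of making the set of valid loading commands smaller than the set of possible commands can be put in numbers. The Python sketch below rests on an assumption that is not in the text: that a fault-generated command is equally likely to be any of the possible bit patterns. The command widths and counts are illustrative only.

```python
from fractions import Fraction

def chance_random_command_is_valid(command_bits, valid_commands):
    """Chance that a uniformly random command_bits-wide pattern is valid.

    Assumes (for illustration only) that fault-generated commands are
    uniformly distributed over all 2**command_bits possible patterns.
    """
    possible = 2 ** command_bits
    if not 0 <= valid_commands <= possible:
        raise ValueError("valid_commands must lie between 0 and 2**command_bits")
    return Fraction(valid_commands, possible)

# Hypothetical figures: 16 valid commands encoded in 8 bits versus 16 bits.
narrow = chance_random_command_is_valid(8, 16)
wide = chance_random_command_is_valid(16, 16)
```

Under this model, widening the command while keeping the number of valid commands fixed drives the chance of an accidental load toward zero, which is the sense in which the risk can be made arbitrarily small.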


Section 1.4

[...] array [...]
of balanced cells [...]

[...] communication of a digital machine with a side-set of one
working base cell in a flawed, arbitrarily large array [...]
array. A loading arm gradually extends an embedded arm [...]
After each extension the arm is tested via processing signals [...]
programmer and the base of the arm. If growth is unsuccessful, the arm is
retracted and then grown through a new [...]. The balance of
[...] cells in the arm machine [...] imply that [...] number of good
[...]

Consider some embedded arm machine [...] T function states [...]
by the list $(F_1, F_2, \ldots, F_T)$. $F_1$ is the function state of
the first cell in the arm, and $F_N$ is the function state of the Nth cell in the arm.
$F_T$ is the function state of the arm’s tip. Because all the function states are
[...]
machine that has T function states is $(F'_1, \ldots, F'_T)$ and $F_N$ is balance-related
to $F'_N$ for all $N \le T$.

Chapter 3 describes [...] of machines,
[...]
given reasonable models of flawed behavior. [...] of arm arrays is studied
[...] a program that simulates that repair (see [...] 3.17 and [...] for
[...]
[...] realization of
[...]
package. [...] which are
realized as a chain of modules, with each module communicating with at most two
[...] checkerboard array.

An essential machine is a perfect machine embedded in a processing layer
[...]
that is described as a machine composed of essential cells that are wired together
in some way. The essential cells of an embedded machine are those cells that are
not in non-branching transmission states: the [...] cells in the [...] square
of figure 1.3 are the essential cells of that embedded machine. A wire carries
relevant information between an essential cell and its essential neighbor, which is
another essential cell or extra-array machine. A wire is a directed signal path
[...] essential cells. [...]
indirect (via cells in transmission states). The output of the element in the middle
cell of figure 1.3 is wired directly to the essential cell above it, and indirectly to
[...] extra-array machine.

An essential machine [...] equivalent embedded [...]. If an essential arm machine
is an arm with T cells, whose Nth cell is
in function state $F_N$, $1 \le N \le T$, then an embedded machine is equivalent if and only
if it is also an arm with T cells whose Nth cell is balance-related to $F_N$.
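
The equivalence rule for arm machines stated above reduces to a cell-by-cell comparison. The Python sketch below assumes the same simplified encoding as before (a function state is a set of (input side, output side) pairs) and treats two states as balance-related when some permutation of the side-sets maps one onto the other. The example states and all names are hypothetical.

```python
from itertools import permutations

SIDES = ("U", "R", "D", "L")

def are_balance_related(state_a, state_b):
    """True if some side-set permutation turns state_a into state_b."""
    for p in permutations(SIDES):
        mapping = dict(zip(SIDES, p))
        if frozenset((mapping[i], mapping[o]) for i, o in state_a) == state_b:
            return True
    return False

def arms_equivalent(essential_arm, embedded_arm):
    """Same number of cells T, and the Nth states balance-related for every N."""
    if len(essential_arm) != len(embedded_arm):
        return False
    return all(are_balance_related(a, b)
               for a, b in zip(essential_arm, embedded_arm))

# Hypothetical function states for the example.
PASS_LR = frozenset({("L", "R")})     # left input transmitted to right output
PASS_UR = frozenset({("U", "R")})     # the same state with side-sets relabeled
STRAIGHT = frozenset({("D", "U"), ("U", "D"), ("L", "R"), ("R", "L")})
BENT = frozenset({("D", "L"), ("L", "D"), ("U", "R"), ("R", "U")})

essential = [PASS_LR, STRAIGHT]
rerouted = [PASS_UR, BENT]            # the same arm, routed along another path
swapped = [STRAIGHT, PASS_LR]         # same states in the wrong order
```

In this example `rerouted` counts as equivalent to `essential`, while `swapped` and any arm with a different number of cells do not.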

Just as a relcon network describes an embedded machine, an essential
[...]
for the essential machine in figure 1.3. A square in the network stands for an
essential cell, and sides of the square stand for side-sets of the essential cell in
the obvious way. Diamonds stand for extra-array machines. If and only if a wire
connects an essential cell’s side-set to an essential neighbor’s side-set, an
associated link appears in the essential network in the expected way. An
essential network is obviously related to the relcon networks of the embedded
[...]
essential network must have one and only one corresponding essential node in each
associated relcon network. If two cells are essential neighbors in an essential
[...], each relcon network must have a path between corresponding essential
[...]. This path may be a link between essential cells that are interconnection
[...] chain of links (corresponding to [...]

[...] high-relcon machines. All embeddings of a high-relcon [...]
with three or four relcon [...]
and differ only in the length of wires connecting [...]
wire in one high-relcon embedded machine, there is one and only one wire in an
equivalent machine. Since essential states of our high-relcon machines are
unbalanced, corresponding essential cells must be in the same function state.
Corresponding essential cells are wired to other corresponding cells and extra-
array machines in the same way. Only the length of associated wires varies in
equivalent embedded machines; chapter 4 discusses [...] implications of
[...]
of different essential cells for simplicity’s sake. This is reasonable, since a
designer typically specifies the simplest essential machine that will perform a
[...]

Our repair of high-relcon machines requires the assumption that the
length of associated wires may differ in equivalent embedded machines. While we
[...]
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most pepropriate high-reicon machine arenitertures. The assumption that. the 
: length of associated wires may difter in oauivalent ented machines Is. not: 
| required for repair of arrays ombeding t! the other machines we otal - 
_ "Uke captor 3 caper 4 foes on testing conigraton, and rep 
| We show that the mechanisms that Provide these facilities are very | close to 
comparable mechaniome Presented. for, chapter. os Stays... whe loaders: are 
functionally identical _Testing is sccomplished by growth | of transmission | links 
: between a test mechine and a tested col, TA fest. dink is a transmission, link 


| between a test machine and some cell in an array. The, be 


to the test machine end the tip. of the. Aik is the call, on the apposite: ond of, the 


| “test link Each cell in the test link conducts egate to end fram the tip end of the | 


link, here a tested cell may be located. The test irk is frown as the body | of a 


: test arm, which is a test link terminated one cell in a “U-turn” function state. 
- Sinai from a test mechine into the. bees of this, deat, orm flaw v gown the arm's 
| body to ie tp try return to th bane ofthe arm and ort the tet machine, 
The test machine uses such signals to monitor the growth of. a test arm. The 
| balanced states of cells in the arm allow it to flexibly eneke around. flawed colis as 
it wows from a test machine t the side-sat of a tested call. on links ore grown 
“to al the accessible side-sots of a tested eal call | These We sow 1 test machine to 
monitor the tested cals behavior inv various ‘unetion states, which » wre sat by the 
lender. 


Repair of arrays embedding high-relcon machines is facilitated by use
of transmission states to wire together essential neighbors which aren't
interconnection neighbors. Experiments with a repair simulator allow us to begin
to compare the repair costs involved with different classes of machines. The most
difficult checkerboard-array repair involves embedding a high-relcon machine
whose essential network is a grid (see figure 1.7.A). We call such a machine a
grid machine, and call its essential cells grid cells. The most general repair of
a checkerboard array assures that a grid will be embedded in the array. Many
high-relcon machines have essential networks with squares and links missing (see
figure 1.7.B). We'll see that these machines are embedded in a flawed array more
easily than grid machines. This shows that knowledge of the constraints on relevant
inter-cell communication paths during an interval facilitates repair, but may
require re-repair when the interval ends.

Chapter 4 also details an array of simple cells designed for the
realization of arbitrary digital machines. Others have described infinite arrays that
may contain initially-finite machines capable of performing any computation and of
constructing other machines that can perform any computation. Our array is the
first one we've seen capable of embedding a universal computer-constructor-
repairer. A computer may be embedded in a finite portion of the processing layer
of the array. A function state that transmits processing inputs as loading outputs
provides this computer with a loading arm. (This is the only time that our previous
description of programmable logic is slightly incorrect, in that processing inputs
may affect loading outputs.) Under this embedded computer's control the loading
Fig. 1.7 Essential Networks For Two High-relcon Machines

A) Grid

B) Non-grid

Fig. 1.8 Relcon Networks For Two Equivalent Tree Machines

A) A tree with several branches

B) A tree that is also an arm
arm works with four test arms to test, configure, and repair the computer's
environment. The machine may construct more memory for itself by using its
loading and test arms. Furthermore, two such computers embedded in an array
may test and repair each other. We briefly describe an embedded machine we've
designed as the processor of a universal computer-constructor-repairer.

Chapter 4 also discusses practical production issues and application
areas relevant to high-relcon machines.

Chapter 5 discusses arrays embedding machines called tree machines.
Random-access and track-addressed sequential-access memories may be
efficiently realized as tree machines in flawed arrays. This is true because tree
machine realizations are appropriate to machines which may be viewed as a
set of modules with a common input bus and a common output bus, with the output
bus accessed by only one active module at a given time. Each cell in a tree
machine is a balanced, essential cell whose function state includes a unique name.
All embedded tree machines have tree-like relcon networks - relcon networks in
which a tree trunk, which may or may not have offshoot branches, extends from
the tree's base cell (see figure 1.8). A tree's base cell is the only cell that is
directly connected to the input-output lines of the tree machine. Two embedded
tree machines are equivalent if and only if they have the same set of cell names;
the particular shapes of their tree-like relcon networks are irrelevant. Thus an
embedded tree machine whose relcon network is an arm may be equivalent to an
embedded tree machine whose relcon network has several branches. Tree
machines are embedded in flawed arrays more easily than arm or high-relcon
machines. If there is any path of good cells between two good cells in a tree
array, those good cells may be incorporated in the same tree machine. Interwoven
test and repair processes for tree machines are like those for arm machines.
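
The reachability condition for tree machines can be illustrated with a short sketch. This is not the thesis's repair program. It assumes a flaw pattern written as strings in which '.' marks a good cell and 'X' marks a flawed cell, assumes four-neighbor checkerboard connectivity, and uses a hypothetical function name, `tree_cells`. It collects every good cell joined to a chosen base cell by a path of good cells, which by the statement above are the cells that could be incorporated in one tree machine.

```python
from collections import deque

def tree_cells(pattern, base):
    """Return the set of good cells reachable from `base` through good cells.

    pattern: list of equal-length strings; '.' is a good cell, 'X' a flawed cell.
    base: (row, col) of the tree's base cell, which must itself be good.
    """
    rows, cols = len(pattern), len(pattern[0])
    if pattern[base[0]][base[1]] != '.':
        raise ValueError("base cell must be a good cell")
    seen = {base}
    queue = deque([base])
    while queue:
        r, c = queue.popleft()
        # Checkerboard connectivity: up, down, left, and right neighbors only.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and pattern[nr][nc] == '.' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

# Hypothetical 3 x 4 flaw pattern (not the one drawn in figure 1.10).
example = [
    "..X.",
    ".X..",
    "....",
]
reachable = tree_cells(example, (0, 0))
print(len(reachable))
```

In this hypothetical pattern every good cell is reachable from the base cell, so all of them could carry a tree-cell name. A good cell that is walled off by flawed cells would simply be absent from the returned set.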
Section 1.4: Array Repair

Chapter 2 reviews particularly relevant work involved with cellular
arrays. Many have presented particular array designs. Some have presented
methods for testing and repairing particular arrays. Most of these methods use
custom metallization, but some use programmed repair. Some have concentrated
on necessary and sufficient conditions for testability or diagnosability of a
particular type of array. We design cell modules which are incorporated into an
array to enable testing, loading, and repair. We also present the first systematic
treatment we've seen of the effect an embedded machine's communication
structure has on the testability and repairability of an array for embedding that
machine.

We describe how constraints on the wiring between essential cells of a
machine affect testing and repair of an array embedding that machine. Chapters 3,
4, and 5 consider this question by focusing on three related classes of machine -
the arm, the grid, and the tree. Figure 1.9 indicates how these three classes
relate. Given a flawed array that is to embed a given type of machine, we model
the repair process in the following way. The flawed array is viewed as a flaw
pattern (see figure 1.10.A), with a dot corresponding to a good cell and an X
corresponding to a flawed cell. The machine to be embedded in the processing layer
is associated with an essential machine and a class of equivalent embedded machines.
In considering repair of an array, this class is restricted to embedded machines
whose dimensions allow them to fit into the flawed array. The nature of this
class depends on the function states associated with a machine. For instance,
balanced states may expand the size of an equivalence class and therefore
facilitate repair. Arms and trees use balanced cells to facilitate testing and
repair. The balance of cells in transmission links facilitates repair of arrays
embedding high-relcon machines.

A particular [...] machine [...]. The array is repaired to embed that machine.
For instance, [...] desired arm machine with 13 cells [...]. Figure 1.10.C shows
a 3 x 2 grid machine embedded in a flawed array. The relcon-2 nodes in the
network correspond to transmission links connected to grid cells. In
grid-embedding, cells used as links are overhead associated with repair.

For every flaw pattern and essential machine, there's an associated
optimum repair efficiency, which is the highest attainable ratio of the number of
embedded essential nodes to the number of dots in the flaw pattern. In
figure 1.10 the optimum repair efficiency is 6/14 for grids, 13/14 for arms, and
14/14 for trees. Let $ORE_G$, $ORE_T$, and $ORE_A$ be the optimum repair
efficiencies for
Fig. 1.9 Relation Between Grids, Trees, And Arms

If a grid's essential network has a certain number S of squares, all the grid's
associated relcon networks have at least S nodes. For each of a grid's relcon
networks with N nodes, there are one or more tree subnetworks with N nodes; N is
greater than or equal to S. One or more of the tree subnetworks of a grid's relcon
network are arms; [...] at least one arm has S nodes.

A) The simplest relcon network of a grid - its essential network

D) Tree subnetworks of B's relcon network

Fig. 1.10 Repair Of Arrays With The Same Flaw Pattern

A) Flaw pattern

B) 13-node arm in flawed array

The repair efficiency is 13/14.

C) 6-square grid in flawed array

The flaw pattern has 14 dots. The embedded relcon network has 14
used nodes, but 6 of these are essential nodes and 8 of these are
relcon-2 overhead nodes associated with transmission links. The
repair efficiency is therefore 6/14.

D) 14-node tree in flawed array

The repair efficiency is 14/14.
grids, trees, and arms for a given flaw pattern. Because of the relation between
grid, tree, and arm machines noted in figure 1.9, $ORE_G \le ORE_A \le ORE_T$ for
any flaw pattern. Chapters 3 and 4 explore the efficiency attained by programs that
simulate repair for arms, and for grids and other high-relcon machines. Chapter 4
compares the results of these experiments. Experimental and theoretical
exploration of testing and repair argue for designs oriented, when possible, toward
limited requirements on the communication paths between a machine's essential
cells.
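
The repair-efficiency arithmetic quoted for figure 1.10 can be checked with a few lines of code. This is only a sketch: the function name `repair_efficiency` is hypothetical, and the node counts are the ones stated in the text rather than the result of any search for an optimum embedding.

```python
from fractions import Fraction

def repair_efficiency(essential_nodes, good_cells):
    """Ratio of embedded essential nodes to good cells (dots) in a flaw pattern."""
    if good_cells <= 0:
        raise ValueError("flaw pattern must contain at least one good cell")
    if not 0 <= essential_nodes <= good_cells:
        raise ValueError("essential nodes cannot exceed good cells")
    return Fraction(essential_nodes, good_cells)

# Counts quoted in the text for the 14-dot flaw pattern of figure 1.10.
GOOD_CELLS = 14
ore_grid = repair_efficiency(6, GOOD_CELLS)   # 6 grid cells; 8 cells are link overhead
ore_arm = repair_efficiency(13, GOOD_CELLS)   # 13-node arm
ore_tree = repair_efficiency(14, GOOD_CELLS)  # 14-node tree

# The ordering stated in the text holds for these values.
assert ore_grid <= ore_arm <= ore_tree
print(ore_grid, ore_arm, ore_tree)
```

`Fraction` reduces each ratio to lowest terms, so 6/14 is reported as 3/7 and 14/14 as 1. Exact fractions are used so that the ordering check does not depend on floating-point rounding.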
Chapter 6 summarizes the thesis, and suggests further production-oriented
and theoretical projects.

The next chapter provides context by exploring other systems
approaches, comparing them to this one, and considering evolutionary trends which
CHAPTER 2: CONTEXT

Section 2.0: Introduction

This chapter puts this work in context, with respect to relevant system
approaches and evolutionary trends. Key parameters of cellular arrays are
discussed, and the relation of our approach to these parameters is detailed.
Cellular arrays and conventional ICs are compared. Fabrication of cellular
arrays on a silicon slice is shown to be similar to conventional fabrication of IC
circuit "chips" on a slice. Four evolutionary trends are discussed: rapidly
increasing capability of integrated circuits, increased reliance on electronic
machines, mass-production of a few high-volume components, and increasing
regularity. Our approach is viewed as a systems approach tailored to the realities
and trends in digital system design, manufacture, and distribution.
Other efforts toward very high integration, testing and repair, and cellular machines
are reviewed, and they are compared to our approach.
Section 2.1: Cellular Arrays

2.1.A Introduction

This section locates our cellular approach in the domain of cellular
arrays. We focus on distinctions in array interconnection, customization, size, and
function. We briefly consider the current state of proposed array systems.

The behavior of a cellular array depends on the functional capability of
its cells, and their interconnection. Since electronics is currently most suited to
implementation of a cell's function, we describe cells using corresponding
terminology. However, the approach applies to functionally equivalent arrays
realized in other technologies.

2.1.B Array Interconnection

Arrays with many different types of interconnection have been studied,
but 1- and 2-dimensional arrays are most common. In a checkerboard array, a cell
may send signals to and receive signals from at most four neighbors. The cutpoint
array, and other arrays with the same type of signal flow, have been extensively
studied. These cutpoint-connected arrays have the same interconnection network as a
checkerboard array, but signals may only enter a cell from its left and upper
neighbors, and leave the cell to enter its lower and right neighbors. We chose a
richer interconnection structure, with its slightly higher associated cost, for several
reasons.


1) Most machines require fewer cells and less associated delay in
checkerboard arrays. Cutpoint-connected arrays are limited by the fact
that an operation on the outputs of some cells cannot be performed
above the lowest of these cells, or left of the rightmost of these cells,
without external connections for this purpose. Checkerboard arrays
don't have this limitation, because each cell outputs in all four
directions. This means, for instance, that the feedback connections of an
embedded sequential machine may be formed inside a checkerboard
array.

2) Signals from an arbitrary cell in a perfect, arbitrarily large array
can cause loading of an arbitrary cell in the array only if there is an
interconnection path from the loading cell to the loaded cell. This
important capability is therefore impossible in cutpoint-connected
arrays.

3) Repair is more flexible in checkerboard arrays, due to the larger
set of possible processing function states.
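
The second reason can be made concrete with a small reachability sketch. The code below is illustrative only and is not taken from the thesis: it models a perfect array as a grid of coordinates with rows numbered downward, and the names `CHECKERBOARD`, `CUTPOINT`, and `reachable` are hypothetical. It counts the cells that a signal leaving one cell could reach under each signal-flow rule.

```python
from collections import deque

# Directions in which a signal may leave a cell, as (row step, column step).
CHECKERBOARD = ((-1, 0), (1, 0), (0, -1), (0, 1))  # all four neighbors
CUTPOINT = ((1, 0), (0, 1))                        # lower and right neighbors only

def reachable(rows, cols, start, directions):
    """Cells that a signal from `start` can reach in a perfect rows x cols array."""
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in directions:
            nxt = (r + dr, c + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

start = (1, 1)  # the center cell of a hypothetical perfect 3 x 3 array
print(len(reachable(3, 3, start, CHECKERBOARD)))
print(len(reachable(3, 3, start, CUTPOINT)))
```

Under the cutpoint rule, no cell above or to the left of the starting cell is ever reached, which is why an arbitrary cell cannot load an arbitrary other cell in such an array. Under the checkerboard rule every cell of the perfect array is reachable.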


The checkerboard array's interconnection structure is highly compatible
with the two-dimensional, step-and-repeat nature of IC production. Furthermore,
this structure is relatively easy to understand and map, compared to, for
instance, hexagonal two-dimensional structures.
2.1.C Customization Techniques

Another aspect of cellular arrays is their customization technique. All
but the simplest arrays have the property that each cell can be customized via
memory elements to one of a set of function states corresponding to various
output functions. Thus an array can be customized to realize a particular
embedded machine by one of several customization techniques.

Unalterable customization late in IC production is accomplished via
mask, fusible metal links, laser, or mechanical scribe. Polycell, [...] Read-Only
Memory (ROM), and Programmable Logic Arrays (PLA) provide well-known
examples of this approach. Because such customization is unalterable, design or
customization errors can be particularly disastrous.

Programmable ROMs (PROMs) achieve greater flexibility by allowing
customization that is alterable, albeit currently difficult. One such technique uses
FAMOS transistors, which can be put in 1 of 2 conduction states by appropriate
electric signals (see <Feeney 72>). Intel guarantees each transistor to hold its
state for 10 years. High-energy ultraviolet light or x-rays can erase these
memory elements for subsequent re-programming. Difficulties include the high
voltages, long write-times, and difficult erasing associated with the FAMOS
transistor. Happily, Stanley Mazor of Intel expects that logic-programmable,
logic-eraseable FAMOS transistors will be developed soon. This would provide the
great advantage of a logic-compatible, read-mostly, nonvolatile semiconductor
memory.
Programmable logic (not to be confused with Programmable Logic
Arrays) provides the ultimate in customization flexibility, at the price of
volatility. The arrays presented in this thesis are arrays of programmable logic.
Re-customization of programmable logic is as easy as loading its function-
specification state bits. We develop an approach which facilitates test and repair
of arrays of programmable logic. Building test and repair mechanisms into a
programmable array can provide lower system test and repair costs than those
associated with other customization techniques.

Because practical implementations of programmable logic would
probably be realized via semiconductor technology, and because semiconductor
memories are currently volatile, programmable logic is currently volatile.
Development of logic-compatible nonvolatile semiconductor memory, such as a
modified form of FAMOS gate, would offer big advantages to programmable logic.

A major difficulty with programmable logic is decreased speed
compared to metal-customized arrays. This is true because there are delays
associated with the selection of a function (see figure 2.1). While
there is no denying the difficulty, two facts ease this situation. The first is
that the delays through gates A and B can be made very small
because these elements can be designed assuming FB, a function-specification
state bit, will not change state during normal operation of an embedded machine.
[...] The second is that logic
Fig. 2.1 Two Customization Techniques

A) Metal-customized

B) Programmable logic
gates are becoming increasingly fast, especially when designed for a known, simple,
low-load environment. If Josephson junction gates become practical, the expected
delay through a gate of about 1 nsec is the same delay as that through 3 cm of
wire.

A final problem with programmable logic is its demand for extra gates
for loading function-specification state bits and selecting a particular output
function. These gates consume an integrated circuit's area and power. The fact
that extra area is required for these gates is offset somewhat by the fact that
programmable logic minimizes non-circuit programming facilities, such as the many
area-consuming bond pads required by some custom metallization techniques. This
becomes more significant with shrinking transistor geometries: bond pads and
other mechanical customization components occupy a relatively higher share of area.
The power consumption problem is alleviated by the fact that function-
specification state bits change state infrequently: in some technologies, an
element's power dissipation is very low when the element is not changing state.

Because of the pin constraints on ICs, most proposals for loading
programmable logic attempt loading via electric signals through leads at the edge
of an array. Chapter 3 reviews the most attractive methods that have been
suggested for programmable logic loaders, and gives a loading approach with
advantages achieved by adding a small amount of circuitry to each cell in the array.
2.1.D Size

Size is another distinguishing attribute of cellular arrays. A particular
array's size depends on the size of each cell and the number of cells. The best
measure of an IC cell's size is the amount of area it occupies. However, this
measure is too dependent on technology, designer, design time, and so forth to be
useful in preliminary estimation of the size of a cell. Consequently, the normal
measure of "size" is gate-count of the cell. This measure has limited value because
of the variable type, number of inputs, and fanout of gates, and because of the
tradeoff between input-output lines and gates for a cell performing a particular
function. Nevertheless, several authors have used gate count as a means for roughly
classifying arrays (see <Minnick 67> and <Mukhopadhyay 71>). They distinguish
between microcellular arrays, in which each cell contains relatively few gates, and
macrocellular arrays, in which each cell contains a large number of gates.

The cells presented in this thesis use few logic elements and few
function states. The loading mechanism, the only mechanism common to all the cells
we discuss, is shown in figure 3.5. It has a minimum of about twenty gates and five
memory bits, with slightly more if loading options are incorporated. A processing
mechanism of any size and complexity may be combined with the loader. The
actual complexity chosen for a cell depends on the envisioned application and on
the tradeoff between yield and overhead circuitry. In the memory arrays we've
designed, this tradeoff is the main consideration in determining how large a memory
[...]. The universal cell presented in chapter 4 uses less than one-hundred gates
and memory bits, and only fourteen function states. This simplicity
increases cell yield and reduces test time.

<Mukhopadhyay 71>, <Kautz 71>, and <Minnick 67> have compared the 
number of cells of different types required to perform various functions. We make 
no such comparison here, for many of the functions we perform cannot be 
performed in other proposed arrays. All our techniques are applicable to 


arbitrarily large arrays. 


2.1.E Function 

Various functional categorizations of arrays have been made. These
include consideration of the functional capabilities and the time behavior of cellular
arrays.

The most common functional classification views an array according to
its ability to do combinational logic, memory, or more general sequential machine
functions. <Shoup 70> discusses this in terms of "generality" of the array. Our
testing and repair techniques work for any cell generality. We discuss some
memory cells in chapters 3 and 5, and a sequential machine cell in chapter 4.

The chapter 4 array is able to realize an arbitrary digital machine. In
particular, the array can support a finite-configuration universal computer-
constructor-repairer. That is, a finite number of cells can be programmed out of
their initial quiescent states into an embedded machine able to perform any
computation, to create a new, disjoint embedded machine able to perform any
computation, and to do these things in a faulty array. The embedded machine's use
of a loading arm and test arms allows it to test and program its environment. It
can, for instance, enlarge its memory by proper loading of cells. <Rowan 73>
describes a cellular array, of more complicated cells, that is computation-universal,
but not capable of construction or repair.

In a synchronous cellular array, all cell states are re-calculated
simultaneously. Several synchronous arrays capable of supporting universal
computer-constructors have been presented. Von Neumann's 1952 pioneering
work, Theory of Self-reproducing Automata, presented such a 29-state automaton (see
<Von Neumann 66>). <Codd 68>, <Gardner 70>, and <Banks 71> followed with
simpler cells. While theoretically interesting, synchronous arrays are peripheral to
this thesis because they are currently impractical. Since state changes must be
synchronized, many technologies require long clock lines linking all cells to a
common clock. Signal transmission is severely limited by the clock frequency, since
a signal takes at least one clock interval to propagate from a cell to its neighbor.
Thus the transmission delay through a cell in a synchronous array is at least the
reciprocal of the toggle frequency of its memory elements, which is much slower
than the transmission delay of one gate-delay associated with asynchronous
arrays. The overhead circuitry for all proposed synchronous arrays is high.
Testing, loading, and repair appear to be more difficult for these arrays.

Asynchronous cellular arrays are far too numerous for extensive
consideration here. <Minnick 67> provides an excellent early review. In a more
recent presentation of a theory of logic design with cellular arrays, <Mukhopadhyay
71> summarizes and analyzes some of the major cellular arrays. <Kautz 71>
discusses various arrays for arbitrary logic, including sequential machines and
special-purpose arrays; many of the designs are his own.


2.1.F Current State

Cellular arrays are already widely used. Popular arrays need no
customization (Random-Access Memories, Shift-Registers), or only a simple
customization step (Read-Only Memories, Programmable Logic Arrays) (see <Luecke
73>). There are also a few systems using many ICs, such as the Illiac IV.

However, many proposed arrays remain paper-studies for various
reasons, including current impracticality and IC industry inertia. The characteristics
of some of these arrays have been discussed in this section, as background for our
approach. This approach overcomes the difficulties of many proposed cellular
arrays. Its loading, test, and repair circuits, and their associated programs, are
compatible with many arrays that have been proposed.
Section 2.2: Array Fabrication

Fabrication of LSI checkerboard arrays is similar to fabrication of
conventional integrated circuits. In the conventional approach, hundreds of
identical ICs are batch processed by selective doping and metallization of a wafer
that is usually 2" to 3" in diameter. Typically, a key element in this complicated
process is use of masks to selectively expose photosensitive material on the
wafer to light. Each mask is formed by photographic reduction of a pattern. That
pattern is formed by use of a step-and-repeat process which iterates a basic
sub-pattern throughout an array. Each sub-pattern corresponds to one of the
iterated IC's masks.

Each of a wafer's identical circuits contains bonding pads, which are
used for probe-testing and possible connection to the IC package. After a wafer
has been batch-processed, each of the identical IC "chips" is tested via electric
communication with a test machine connected through probes to the chip. Those
chips that are defective are inked. The wafer is diced along horizontal and vertical
scribe lines into component chips. Those chips that have been inked are discarded.
The other chips are packaged and retested. Those that pass these final tests are
ready for use.

For a checkerboard cellular array, a basic circuit is similarly
step-and-repeated to form an array of identical circuits. However, the patterns of
edge-sharing neighbors overlap slightly to allow lines to interconnect neighbors.
Because most of the identical circuits, cells, communicate only with their edge
neighbors, most do not need bond pads. Only lines that may be used for extra-
array communication need bond pads. Scribe lines between cells are unnecessary,
for the array need not be diced. [...] of a wafer that are intended to be part of
different arrays.

Most conventional IC packages are eventually connected to other, similar
packages. Thus a given IC's life-cycle usually includes batch-processing with
identical chips, separation from them, and eventual connection to other chips.
We'll see that there are many advantages in a systems approach that doesn't
require separate handling of each chip. The separation is now required for two
principal reasons. First, conventional ICs cannot operate properly if any of their
components are faulty; this necessitates making a chip small enough so that
there's a reasonable chance it will be perfect. Second, a system of small IC chips
requires a variety of chips; a slice contains only one type of chip.
This thesis' cellular approach eliminates the need for separation of chips
[...] machine allows testing and repair of the [...] of the slice can be
tolerated. Chapter 4 discusses such an array of identical, [...] cells that are so
flexible that a large enough array can perform any computation [...] with a digital
[...] repair its cellular environment.
This thesis focuses on checkerboard arrays designed to tolerate
defective cells. Cells are programmable logic. In each working cell,
programmable function-specification state bits specify the function that the cell is
to perform. Once an array has been formed, electric communication of a test
machine with a small number of cells anywhere in the array tests the entire,
arbitrarily large array. The testing machine uses the same communication links to
program the array to embed a perfect machine, by appropriate setting of function-
specification state bits. The same and/or other communication links then provide
the inputs and outputs of an array-embedded machine, realized as cells in proper
function states. Re-customizing the array is as simple as re-loading its function-
specification state bits. Should an array machine become defective because of a
malfunction of its circuitry, it need not be discarded; its links can be used for
testing and repairing the array. Because the repair can be done electrically by a
digital machine, repair can be automatic, standardized, and quick. Repair can even
occur through communication between an array and a remote test machine. Indeed,
the universal array of chapter 4 can support embedded machines that test and
repair each other.

The array can be viewed as a bin of spare parts, cells. For a large bin,
there is a high probability that a certain percentage of parts will be good. Thus an
array is fabricated to have more than enough parts for a particular envisioned
application. The availability of spare parts which can be electrically switched into
active status allows the realization of IC packages with more functional power.
That is, higher integration is attainable through automatic repair. Furthermore, this
spare-part capability facilitates re-customization, reliability, and maintainability.
For many types of circuit failure, an array can be re-programmed to resume


“ _ asic, powertul -copebltitios clesigneth iiss Witiphs, ae 
< :atltaue progremdconbralied testis, dubtitihidtten,® atl opi 14° in a “etahiderd, 
os: nedubery-maas-peuiond sqrt, he UR ol igh Wildree produced bigtuthonty 


throughout the wafer,:are moreinpertant® endl isttivBllschda’ W'S Yok 
performance like a perfect array. This allows graceful degradation of an array.
Simplification of system production and maintenance [...] part. All circuitry that
[...] cause
array testing, customization, and repair via electric communication with the [...]
array. The standard, modular nature of an array and its [...] implies
low-cost, high-reliability realizations of [...]

In either conventional or cellular IC production, each process step has an
associated yield. That is, each process step tends to [...] a percentage of the
product. Wafer processing yield losses can be ascribed to two types of
defects - area or line defects, and spot defects.

Area or line defects involve the clustering of faulty [...]
Although they can occur anywhere, they occur most frequently at a slice's edge,
due to handling, misalignment, and [...] <Hodges 72> adds that
they're usually caused by human error and that, in a well-controlled process, they
are rare and of little significance.

Spot defects, characterized by [...]
feature. Their most common cause is photolithography (see [...]). They can
result from, for instance, a dust particle between a mask and a wafer during



contact printing of the wafer.

If a wafer contained only spot defects, one would expect an exponential
decrease of chip yield, as measured by the percentage of perfect chips, as a
function of active (component-containing) chip area. <Hodges 72> notes that the
yield model proposed by Dingwall is most rigorously documented by manufacturing
experience:

    Y = (1 + D_0 A / 3)^{-3}

Y is yield, the ratio of good chips to total chips. D_0 is the number of defects per
square inch of slice. A is the active area of a chip, in square inches. In 1972,
Hodges said that “a D_0 value of 200 per square inch is quite typical for normal
operations by efficient producers of both bipolar and silicon-gate devices.”
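
The Dingwall relation can be checked numerically. The sketch below is an illustration, not taken from <Hodges 72>: it evaluates the model for square active areas of 30, 60, and 120 mils on a side (assumed here to correspond to the SSI, MSI, and LSI cases of Table 2.1) at 200 defects per square inch. For comparison it also evaluates a simple Poisson model, Y = exp(-D_0 A), which is one common way to express the exponential decrease expected for purely random spot defects; that form is an assumption, as the text does not give it. The function names are hypothetical.

```python
import math

def dingwall_yield(d0: float, area: float) -> float:
    """Dingwall model: Y = (1 + D0*A/3)^-3, with D0 in defects per square inch
    and A the active chip area in square inches."""
    return (1.0 + d0 * area / 3.0) ** -3

def poisson_yield(d0: float, area: float) -> float:
    """Poisson model for purely random spot defects: Y = exp(-D0*A).
    (Assumed form of the 'exponential decrease'; not given in the text.)"""
    return math.exp(-d0 * area)

D0 = 200.0  # defects per square inch, the value quoted from Hodges

# Edge lengths in mils (thousandths of an inch) of a square active area;
# these three cases are assumptions chosen to mirror Table 2.1.
for label, edge_mils in (("SSI", 30), ("MSI", 60), ("LSI", 120)):
    area = (edge_mils / 1000.0) ** 2
    print(f"{label}: A = {area:.4f} sq. in., "
          f"Dingwall = {dingwall_yield(D0, area):.1%}, "
          f"Poisson = {poisson_yield(D0, area):.1%}")
```

Under these assumptions the Dingwall model gives roughly 84, 52, and 13 percent, and the Poisson model falls off faster as the area grows.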
Because there are other process steps besides those for wafer
fabrication, there are other yield losses in fabrication of conventional IC systems.
Wafers are dropped and broken as they are moved between process sites.
Perfect chips are packaged or bonded incorrectly. Test programs and testing
machines fail, causing erroneous rejection of working ICs. Packages are
mislabeled. (One engineer told us of a packaging site putting read-only memories
with different contents in the same container before labelling - they were all
read-only memories.) Perfect ICs are miswired or broken.

Great expense is incurred in attempting to minimize yield losses.
Manufacturers update their assembly lines, attempting to achieve a clean, efficient
flow of materials. Circuits are designed under many difficult constraints. A prime
one is the need to minimize pins leading from the chip to the extra-IC world. This
allows lower packaging and lead-bonding costs. We recognized the need to 
minimize pins when we developed a loader that required only a few pins. This is 
also a major reason why our testing and repair processes require access to only a 
few pins. Because yield and cost depend on many variables, including time, the
figures in Table 2.1 can only be expected to point to some relevant considerations
in IC production. More detailed analysis than is appropriate here can be found in
<Hodges 72>, <Murphy 64>, or <Seeds 67>.


Table 2.1 Chip Yield And Manufacturing Costs
(assuming $20 slice fabrication cost and 200 defects per sq. in.)


                                 SSI        MSI        LSI
Active area (mils)             30 x 30    60 x 60   120 x 120
Chip size (mils)               50 x 50    80 x 80   140 x 140
Probe yield (percent)             80         50         15
Chips per 2 in. dia. slice      1200        400        150
Good chips                       960        200         22
Cost per chip                  $0.02      $0.10      $0.90
Package (DIP)                   0.03       0.04       1.00
Assembly                        0.05       0.05       0.50
Final test                      0.02       0.02       0.20
Accumulated cost                0.12       0.21       2.60
70% yield at final test         x1.4       x1.4       x1.4
Factory cost                   $0.17      $0.30      $3.60
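
The bottom rows of Table 2.1 follow from the top rows by simple arithmetic. The sketch below assumes the table's figures as read here ($20 per slice, the listed good-chip counts and per-unit packaging, assembly, and final-test costs, and the x1.4 multiplier for 70% final-test yield); the function and variable names are hypothetical. Because the table rounds intermediate values, the results agree with it only to within a few cents.

```python
SLICE_COST = 20.00           # dollars per fabricated slice (table assumption)
FINAL_TEST_MULTIPLIER = 1.4  # approximately 1/0.70, as used in the table

# (good chips per slice, package, assembly, final test), costs in dollars
CASES = {
    "SSI": (960, 0.03, 0.05, 0.02),
    "MSI": (200, 0.04, 0.05, 0.02),
    "LSI": (22, 1.00, 0.50, 0.20),
}

def factory_cost(good_chips, package, assembly, final_test):
    """Return (die cost, accumulated cost, factory cost) for one good unit."""
    die = SLICE_COST / good_chips
    accumulated = die + package + assembly + final_test
    return die, accumulated, accumulated * FINAL_TEST_MULTIPLIER

for label, figures in CASES.items():
    die, accumulated, factory = factory_cost(*figures)
    # Share of accumulated cost incurred after wafer fabrication.
    post_wafer_share = (accumulated - die) / accumulated
    print(f"{label}: die ${die:.2f}, accumulated ${accumulated:.2f}, "
          f"factory ${factory:.2f}, post-wafer share {post_wafer_share:.0%}")
```

With these figures, more than half of the accumulated cost in each case arises after wafer fabrication.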


In a March 1975 letter to us, Hodges made the following points: Die cost has
declined slightly, and inexpensive plastic packaging is now widely used for LSI.
Factory cost of LSI components is now in the range of $1 to $4. The overall
picture reflected in the table above hasn't changed much.

Key points from the Hodges paper relevant to this work include:

1) Packaging, assembly, and test costs dominate wafer fabrication 
costs for all types of ICs, even though Hodges’ figures neglect pre- 
dicing test costs. However, wafer yield becomes increasingly important 
as integration level increases. 

2) The absolute cost of packaging, assembly, and testing becomes 
much higher as one moves to LSI. 

3) Packaging accounts for a substantial yield loss. (Hodges is
probably oversimplifying in assuming the same package test yield for SSI
and LSI (see <Camenzind 72>). LSI tends to use packages with more
pins, resulting in more packaging errors. However, improved assembly
methods for LSI relative to SSI would tend to offset this.)

Hodges makes the important point that “engineering and marketing costs 
are probably the dominant factors in the overall cost of LSI today; this will 
probably continue to be true as long as the market life of products continues to be 
short due to rapid innovation.” This factor is particularly important for custom LSI, 
where the interface between the IC house and the IC user can be very clumsy
(see <Camenzind 72>). <Mostek 73> notes that Mostek’s charges for custom
tooling of an IC range from $20,000 to $55,000. The exact amount depends on the
manpower and testing demands made on Mostek. Delivery time for a small number
of prototype units is from 6 to 9 months after the start of the customer’s
interface with Mostek.

We present a strategy for realization of large, high-yield, low-pin ICs
via automated, electronics-oriented means. Because customization and repair are 
electric processes acting on a standard IC array whose cells are designed for 
testability and repairability, costs can be low, system production can be quick, and
the inevitable errors resulting from a plethora of components - ICs, testers, test
programs, etc. - can be minimized.


Section 2.3: Evolutionary Trends

Evolutionary trends in digital systems reflect shifting realities and 
priorities important to digital systems planning. These motivate an interest in 
electronically testable, programmable, and repairable cellular arrays. Those trends
most important to this thesis are summarized below.


2.3.A Trend 1: Rapidly increasing capability of integrated circuits 

Table 2.2 indicates the skyrocketing performance and complexity of
integrated circuits (see <Altman 74>). The table assumes a rectangular chip, and a
“device” is a transistor. Prices have dropped with the rise in IC performance.
<Moore 74> states: "One thing that Shockley was interested in doing was making 
the 5-cent transistor. At the time, it seemed like a goal so distant it might never 
be achieved. Many people thought the dollar transistor wasn’t in the cards. And 
now we sell transistors as parts of an IC at a very small fraction of a cent - 
probably 1/100 of a cent.” This development comes from introduction and 
refinement of various IC technologies - for instance, bipolar, MOS, magnetic 
bubbles, and Josephson technologies. Each technology has inherent physical 
characteristics that develop as it competes with other technologies. Function per 
unit area increases as function per circuit element and circuit element per area 
increase. Furthermore, reduced defect densities allow the packaging of larger 
circuit areas. In a 1973 MIT talk, Bob Noyce of Intel observed that the trends
toward lower defect densities and higher device densities - each changing by a
factor of 2 every 3 years - had helped cause the doubling of the number of
transistors in the most complex commercially available chip every year since 1959,
when there was a single transistor per chip.
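
Noyce's figures can be turned into simple arithmetic. The sketch below is an illustration with hypothetical function names: it extrapolates from one transistor per chip in 1959 at a doubling per year, and it computes the yearly growth factor contributed by two trends that each change by a factor of 2 every 3 years, assuming (as a simplification) that their effects multiply.

```python
def transistors_per_chip(year: int, base_year: int = 1959) -> int:
    """Most complex chip under yearly doubling from one transistor in base_year."""
    return 2 ** (year - base_year)

def yearly_factor(doubling_period_years: float) -> float:
    """Growth factor per year for a quantity that doubles every given period."""
    return 2.0 ** (1.0 / doubling_period_years)

# Defect density and device density each improve 2x every 3 years; if the
# two effects simply multiply (an assumption), their combined yearly factor is:
combined = yearly_factor(3) * yearly_factor(3)

print(transistors_per_chip(1973))  # 16384
print(round(combined, 2))          # 1.59
```

On this reading the two density trends account for a factor of about 1.6 per year, so they help cause, but do not by themselves produce, a doubling every year.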

This rapid increase in IC capability is a fundamental cause of the rapid
increase in digital system performance (see <Turn 72> and <Kosy 72>).


Table 2.2 IC Evolution


Typical industry capability           1966        1973        1980


Maximum clockrate (MHz)           [illegible]      300        2000
Transmission bandwidth (GHz)             3           1           6
Speed-power product (pJ)               100        3-10        .1-1
Complexity:
Maximum chip edge length (mil)         100         250         500
Device density (mil²/device)         20-50         2-5       .1-.3
Maximum transistors per chip            50        5000     [illegible]


2.3.B Trend 2: Increased reliance on electronic machines

Rapid improvements in IC capability, combined with other factors such as 
rentalism (see <Toffler 70> and <Vischi 72>) have fostered highly volatile, 
competitive IC and digital systems industries, marked by rapid product 
obsolescence and burgeoning markets. These factors have also shifted system 
design priorities toward reduction of mechanical and human labor by use of 
electronic devices. ICs are increasingly used to reduce other costs of electronic 
systems, such as interconnection costs.

The replacement of mechanical machines by electrical ones is pervasive
(see <Foss 70>, <Toffler 70>, and <McLuhan 64>). The calculator, cash register,
and watch markets provide conspicuous examples. Microelectronics provides speed,
size, weight, reliability, and cost advantages in these information-oriented,
low-energy applications.

Increasing dominance of human labor costs in total digital system costs,
coupled with a need for shorter design and maintenance times, has spurred use
of microelectronics to reduce labor. This is part of society’s overall move
toward automation. <Bell 72> observes that: “In contrast to technology, system
design costs have risen; this shift is demonstrated by, for instance, the decreased
emphasis on minimization in logic design, but on the other hand, reliability, mass
producibility, and maintainability are now the important design criteria.” <Franson
74> notes that the shortage of IC designers is forcing system designers to use 
automated help. <Hodges 72> marks the dominance of engineering and marketing 
costs in the overall cost of LSI. Labor-intensive software costs increasingly 
dominate hardware costs in digital systems. Bob Lloyd of Advanced Memory 
Systems told us of the trend in the IC industry toward silicon-intensive, computer- 
intensive production. <Vischi 72> lists rentalism and “shortage of technical 
manpower and the increasing expense of salaries and training” as pressures for 
higher system reliability. 

Of course, the relative advantage of ICs in reducing other costs depends 
on the particular application. For example, the merits of reliability via extremely 
reliable components built into a redundancy-oriented system are much clearer in 
an aerospace application than in a commercial one. Table 2.3 gives the relative
costs of efforts to ensure various levels of IC reliability (see <Peattie 74>). The
categories, which range from commercial to captive line, represent a spectrum
where “the basic factors producing considerably different failure rates are several:
the device design, the number of inline process-control inspections used, their
level of rejection, and the degree of reliability screening.”


Table 2.3 Relative Cost Of IC Reliability Efforts


Part Class                    comm.    c       b       a       capt. line
Failure Rate (%/1000 hrs.)    .1       .05     .006    .003    .001
Cost                          1.00     1.3     1.8     2.8     4-6


<Peattie 74> also gives Table 2.4, which demonstrates the wide range 
of reliability expenses which various applications demand. The table shows the
cost of detecting and removing defective semiconductor devices at four stages in
four types of system markets.


Table 2.4 Cost For Failure At Various System Development Stages ($)


Market        Incoming   Board Mount   System Test   Field Use
consumer          2           5             5             50
industrial        4          25            45            215
military          7          50           120           1000
space            15          75           300           200M


Demand for higher integration level is consistent with the increased 
importance of ICs in the total system. Higher system integration results in fewer 
ICs to handle, test, and store. Decreased system size and weight have obvious
storage and transportation benefits. (For instance, many IC houses ship IC slices
overseas for low-cost packaging.) A less expensive printed-circuit or wire-wrap
board is required. User interconnect costs, which may run as high as 50 cents per
TTL gate, are reduced (see <Noyce 71>). “Since bond failures account for up to
55% of all IC field failures, a significant reliability improvement can be made.” (See
<Colbourne 74>.) Higher integration also reduces package-related failures (Ibid.).
Improvements in device matching, noise immunity, driver circuitry, power
requirements, and transmission delay also result from higher integration, fewer
packages, and fewer interconnects.

There are also disadvantages to high integration levels. A higher 
number of circuit elements per input lead makes a system more difficult to test.
<Vaccaro 74> notes “the major dilemma facing the user of LSI today is simply that
we can build and are building microcircuits today that are more complex than we
can adequately test, functionally or parametrically.” <Vinson 74> notes the same
difficulty with the increasing complexity of circuit boards. Higher integration levels
have implied higher investments in IC chips. For instance, estimates of the total
development costs of Intel’s 8080 microprocessor are in the million dollar range.
This high investment places high demands on the device’s design and reliability.
One poorly specified component can result in large losses. High carryover from
one IC system design to its successor is necessary to reduce the staggering
development costs. For a given technology at a given time, higher integration
implies lower yield, because of the higher probability that one of the many
components will be defective.


2.3.C Trend 3: Mass production of a few high-volume components

The realities of LSI design, production, delivery, and maintenance argue 
for the mass-production of a few powerful, high-volume components. <Franson 
74> notes that Fairchild and National are emphasizing standard ICs over custom 
ones because “they get a better payoff for the engineering time and capital 
expenditure.” He observes that “the big IC producers aren’t after the business
unless the volumes are high - 100,000 per year and up.” Each time cumulative
production of an IC doubles, its unit price tends to drop 30% (see <Luecke 73>).
For high-volume devices, this is chiefly due to a learning curve. This learning 
curve, resulting from detective work into ways to improve production, argues for 
high-volume devices. <Vaccaro 74> cites availability and product data-base 
reasons for his conclusion that "it is clear that considerably more attention must be 
paid to selection and standardization of fewer proven device types within the 
Department of Defense.” <Vinson 74> observes the maintenance difficulties 
associated with a plethora of ICs, circuit boards, and testers. He discusses 
designing for testability and maintainability, and suggests standardization of system 
components, test procedures, and equipment. 
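
The learning-curve rule quoted above from <Luecke 73>, a 30% price drop for each doubling of cumulative production, can be written as price = p0 x 0.7^(log2(V/V0)). The sketch below is an illustration only; the starting price and volumes are made-up values, and the function name is hypothetical.

```python
import math

def learning_curve_price(p0: float, v0: float, v: float, drop: float = 0.30) -> float:
    """Unit price after cumulative volume grows from v0 to v, if each
    doubling of cumulative volume cuts the price by the fraction `drop`."""
    doublings = math.log2(v / v0)
    return p0 * (1.0 - drop) ** doublings

# Hypothetical part: $10.00 at 10,000 cumulative units.
for volume in (10_000, 20_000, 80_000, 1_280_000):
    print(volume, round(learning_curve_price(10.00, 10_000, volume), 2))
```

Seven doublings, for example, bring the price to about 8% of its starting value, which is the kind of advantage that favors a few high-volume parts.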

In his 1973 MIT talk, Bob Noyce compared the production of high-volume
ICs to the printing of money. In a 1974 MIT Project MAC talk, Rick Dill
agreed, and compared the production of low-volume custom ICs to choosing
engraving over money-production.

<Noyce 71> observes that design of standard components with
capability in excess of many application needs allows the economies of
mass-production to operate.

In his 1973 MIT talk, Noyce also demonstrated the thirst of the IC
industry for high-volume markets. He noted that the growth rate of microcomputers
was twice that of all other computers. He observed that whereas the
computer market was 2 x 10° gates per year, the calculator market was 5 x 10°
gates per year and doubling every year. He estimated the watch market at 10°
200-gate units per year, and the smart-phone market at 10¹¹ gates (2 x 10⁸
500-gate units) per year. Microma, a watch company, is now owned by Intel. Several
Intel people have told us of Intel’s eagerness to enter the smart-phone market.
The 10/2/74 Boston Globe states that “National Semiconductor said it’s entering
the electronics timepiece market with a full line of quartz digital watches and
solid-state alarm clocks.”

Various options are currently open to designers of systems with
volumes too low or design times too short (20 weeks average for custom - EDN)
to justify a custom IC approach. Use of standard SSI and MSI components, with
customization via choice of interconnection of the components, is common for
low-volume and/or high-speed applications. High speed comes from customization
of the system to the task, parallelism, and high-speed operators. Disadvantages
include the large number of parts required, interconnection costs, and, in many
cases, the problems of nonstandard systems we've discussed. Another approach
uses customization of a general-purpose computer via a memory-stored program.
This provides lower speeds than a hard-wired device. However, it allows use of a
small number of high-volume components. Microprocessors are increasingly 
dominating this approach. Chapter 4 presents a cellular array that promises to be
between microprocessor and custom-interconnect systems in performance and cost
for many applications.

Rick Dill and others have asked a key question: “What should we
mass-produce in LSI?” MSI and SSI have readily identifiable functions suitable for them,
such as adders, gates, flip-flops, and shift-registers. LSI is very suited to memory,
with its regularity, low ratio of pins to area, and high volume. Bigger, faster
memories are evolving. They may even have added power, such as associative
memories. <Moore 73> indirectly observes the need for embodying low-volume 
systems in a high-volume IC: "We expect LSI to give us some very large building 
blocks, such as high-performance processors. Once that point is reached, we could 
go on to self-contained systems, but I question whether systems will be
economical except in specific, high-volume applications.” <Noyce 71> predicts that
in 1980 a 100,000 gate “superstar computer’s needs could be satisfied with about
10 bipolar LSI chips. However, even if the capability exists to put all of a
computer’s logic on a few chips, this doesn’t mean we'll be able to find a practical
design philosophy that will permit us to do it - and that brings us back to the
whole interconnection problem again.” That is, interconnects between IC
components can consume large amounts of design time and chip area (80% is a
good guess of percentage of chip area devoted to interconnects for current
systems - <Luecke 73>). Furthermore, the fact that several layers of metallization
are normally associated with a higher level of integration is a prime contributor to
low yield and reliability. These considerations lead us naturally into the next
evolutionary trend.


2.3.D Trend 4: Increasing Regularity

The trend toward component and system regularity is evident. Digital 
systems use far more memory bits, which are organized in a regular fashion, than 
less regular combinational logic elements (see <Luecke 73>). Functions
traditionally performed by “random logic” are increasingly performed by arrays.
For instance, microprograms embedded in memory have taken over many
computers’ control functions, which were formerly done by random logic.
Programmable Logic Arrays, extensions of read-only memories that allow use of a
regular array to realize complex combinational logic functions, also evidence this
trend. Increased interest in array architectures evidences this trend on a larger
system level.

There are many advantages to regularity. Iteration of a simple 
component, or cell, is consistent with mass-production. It allows concentration of
system efforts on a component that can be optimized, rather than distributed effort
over a collection of equally important, different components. Simplicity of a cell’s
environment allows concentration of effort on design of the cell for that
environment. A regular environment is easier to understand and test, both for men
and computers. “Repetitive layout contributes to the realization of high circuit
density per chip.” (See <Carr 72>.) Regularity implies constraints on metallization
paths, which usually implies fewer crossover and crosstalk problems. As a
technology improves, a more thorough carryover of investment is possible between 
designs for a regular machine. (<Lathrop 70> makes many of these points.) 

These generalities are supported by the rapid evolution of memories. 
Their regularity allows concentration on optimization of the memory cell (RAM, 
ROM, shift-register, etc.). This optimization is helped by the identical environment 
of all memory cells internal to a memory array. For instance, the impedance a 
circuit drives can be standardized and well-controlled. Once one understands how 
one cell works, it is easy to understand the entire array. Production of the array 
can be performed by replication of one of its cells. A memory array is relatively 
easy to test. Memories are so common and fundamental that every designer 
knows their important parameters. Should technology improve (by, for instance, 
improvements in yield and reduction in cell size), a basic memory cell design is 
easily adapted to a larger, denser chip. The transition from one memory array to a
larger one is relatively easy compared to, for instance, the transition from one
microprocessor to a larger one.


Section 2.4: Trends And Arrays

The advantages that make cellular arrays interesting derive from their 
nature as an iteration of functionally, and usually physically, identical components. 
We've already explored the motivation for the trends toward mass-produced
standard parts and regularity. These reasons, applicable to cellular arrays, are
summarized below.


Advantages of mass-production of a standard part include: 
1) High engineering and marketing efforts
expended on a high-volume part
2) Higher uniformity in production processes
3) Higher availability
4) Reduced inventories
5) More thoroughly characterized device, which implies
a) learning-curve-related improvements in
design and manufacture, and
b) [...] data base for reliability and failure-mode predictions


Advantages of regularity include: 
1) All advantages of a mass-produced standard part
2) Simplified connections 
a) easier to design and produce 
b) higher component density via reduced interconnect areas 
3) Controlled, simplified environment 
a) iterated element can be optimized for this environment 
b) easier system to understand, test, and maintain 
4) Performance increase via numerical component increase, 
implying the ability to incrementally add performance 
5) Greater carryover between designs


When cellular arrays are used for realization of combinational logic or
sequential machines, they have added benefits. Controlled customization of a
flexible, regular, mass-produced device implies more stages of mass-production
and a faster product development time than for custom realization of an irregular
special-purpose machine. Furthermore, arrays can provide higher parallelism than
single-sequence computers used for the same function.

The highly structured nature of cellular arrays also has inherent 
disadvantages. When an array is customized to behave like a particular machine, 
more components are involved than for a custom-built machine. At any time, the 
cellular array has a lower density of active components than the custom machine. 
Furthermore, the array has more parts that can fail. Of course, component costs 
are decreasingly important in digital systems. If the cellular array can be easily 
re-customized (if it’s programmable, as ours is), this added capability cannot be 
considered as useless overhead. The enormous overhead of a general-purpose 
computer processing one instruction at a time is acceptable because of its 
flexibility, generality, and computational power. Use of the simple structure of a 
regular array to facilitate test and repair ameliorates the increased probability of a 
component failure. 
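
A rough calculation suggests how much spare cells can help. The sketch below is an illustration under strong assumptions that are not from this thesis: each cell is good independently with probability p, and any good cell can be used (it ignores good cells walled off by flawed ones and the routing needs of a particular embedded machine). It compares the chance that every cell in an array is good with the chance that at least a required number are good. The names and numbers are hypothetical.

```python
from math import comb

def prob_all_good(n_cells: int, p_good: float) -> float:
    """Probability that every cell is good (array usable without repair)."""
    return p_good ** n_cells

def prob_at_least(n_cells: int, needed: int, p_good: float) -> float:
    """Probability that at least `needed` of `n_cells` cells are good
    (binomial tail, independent cells)."""
    return sum(
        comb(n_cells, good) * p_good ** good * (1.0 - p_good) ** (n_cells - good)
        for good in range(needed, n_cells + 1)
    )

# Hypothetical array: 100 cells, each good with probability 0.9,
# embedding a machine that needs 80 working cells.
print(prob_all_good(100, 0.9))
print(prob_at_least(100, 80, 0.9))
```

With these made-up numbers a perfect array is very unlikely, while an array with enough good cells is nearly certain.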

A second general disadvantage of cellular arrays is “an increase in the
length of wire through which a signal must propagate in comparison with
conventional logic not possessing stringent interconnect pattern limitations.” (See
<Spandorfer 68> and <Hu 73>.) Again, the enormous inter-operation delays
associated with single-sequence computers make this problem seem less severe
for programmable cellular arrays.

<Mukhopadhyay 71> mentions another problem associated with cellular
arrays:
“Another major problem in cellular logic is standardization. We now
have at hand a technology which can produce arrays of cells of very 
large complexity, but we do not yet know how to use these devices 
efficiently in practical designs. This is because of the rapid growth of 
the number of logic functions of a cell with the number of 
inputs/outputs, so that when a cell has more than 4 or 5 inputs, one
simply does not know what to put into the cell in order to obtain a cell
that may be widely used in digital circuits.”
This point is evidenced by the plethora of cell designs proposed.

This thesis addresses this problem by identifying and satisfying
important requirements on a cellular array. We develop mechanisms for automatic
test, loading, and repair of cellular arrays. Various cellular arrays incorporating
this machinery are presented. These include memory arrays, and an array capable 
of supporting universal computer-constructor-repairers. We show practical 
advantages of these cell designs, relative to other designs, which argue for their 
realization. One such advantage is that these standard arrays can be electrically 
customized to a wide range of customer needs. Realization of and experimentation 
with a good cell design would clarify many of the issues involved with cellular 
arrays; a better understanding of the most important parameters would result.

<Mukhopadhyay 71> mentions another difficulty that he sees:
“A difficult problem arises when one has to locate and correct a faulty
cell in an array. Since the number of test points or the input/output
pins available is very small compared to what can be expected of a
circuit of similar complexity built with lumped components, the 
difficulties in diagnosis problems have just been compounded. It seems 
unlikely that good and practically manageable algorithmic solutions to 
this problem will be developed in the near future because of what 
seems to be an inherent contradiction in the objective: programmability 
and flexibility which can be achieved by increasing the cell complexity. 
This implies an exponential growth of fault types and correspondingly 
astronomical growth of the number of tests to be applied to a very 
limited number of test points. The logic designer of cellular arrays will 
have to accept a certain amount of failure in the circuit and will have to 
invent synthesis procedures for fault-tolerant circuits.” 
We believe that this thesis refutes this argument. We present simple mechanisms 
which enable test and repair of an array. Particular machines which incorporate 
these mechanisms are developed. Furthermore, many existing array designs may 
be modified to incorporate our test, configuration, and repair mechanisms. 
Mukhopadhyay’s argument seems to imply that large random-access
memories are untestable. We assume that a cell’s behavior depends only on its
state and inputs, and not on the state of another cell. We try to justify this
assumption by constraining our design; for instance, our design has no signal
busses connecting distant cells. Our independence assumption is analogous to
testing assumptions made for current integrated circuits, including random-access
memories. This independence assumption enables specification of simple function
states to allow individual test of an array’s cells via leads attached to a few cells
anywhere in the array. Leads to one side of one cell are sufficient to allow the
repair of most arrays. Our techniques allow the verification of our independence
assumption by testing a complete embedded machine. Complete arm and tree
machines may be easily tested via their inputs and outputs. Test links may
connect internal points in a high-reicon machine to cells at the edge of an array.

Finally, many have observed that truly powerful arrays need many cells.
Current realities of IC fabrication therefore suggest repairable cellular arrays as a
means for achieving large arrays. As densities and yields improve, building a little
circuitry into a cellular array to help testing and repair becomes more appropriate.
We focus on automatic, electronic repair because of its increasing attractiveness.


Section 2.5: Testing And Repair


2.5.A Non-cellular 

The increasing difficulty of testing digital systems was noted above. 
Reasons include the plethora of digital systems and testers, their increasing 
complexity, and the decreasing ratio of test points to circuitry. At one time digital 
ICs could be tested by monitoring their output for each possible combination of 
input and internal state; Ed Fredkin recalls when many engineers insisted ICs 
required at least this much testing. This is currently impossible for most ICs, due
to the astronomical time for such a test (2¹⁰¹ tests for a serial-in, serial-out
100-bit shift-register). Consequently, less ambitious approaches are now used.
Common test sequences assume one logic element’s performance is independent of 
the performance of other elements in the circuit. A gate may then be tested by 
putting its IC into a state in which that gate’s output, a function of its varied 
inputs, affects the output of the IC. This approach is especially useful in go - no 
go testing of ICs, where one only wants to know whether a circuit meets its 
specifications, and not the causes of faulty performance. Most procedures that 
must locate faults, such as maintenance procedures, assume a single fault (see 
<Marinos 71>). Because of feedback paths, sequential machines are particularly 
difficult to test. 
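
The size of the exhaustive test can be made concrete. The sketch below counts one test per combination of internal state and input, as the paragraph above describes, and contrasts it with a per-element count that becomes possible if each element is assumed to behave independently of the others. The tester speed of 10^9 tests per second and the function names are hypothetical, chosen only for illustration, and the per-element count ignores the extra steps needed to bring each element into the desired state.

```python
def exhaustive_tests(state_bits: int, input_bits: int) -> int:
    """One test per combination of internal state and input."""
    return 2 ** (state_bits + input_bits)

def independent_tests(elements: int, state_bits_each: int, input_bits_each: int) -> int:
    """Tests needed if each element can be exercised on its own
    (independence assumption): exhaustive per element, summed."""
    return elements * 2 ** (state_bits_each + input_bits_each)

SECONDS_PER_YEAR = 365.25 * 24 * 3600
TESTS_PER_SECOND = 1e9  # hypothetical tester speed

# Serial-in, serial-out 100-bit shift-register: 100 state bits, 1 input bit.
tests = exhaustive_tests(state_bits=100, input_bits=1)  # 2**101
years = tests / TESTS_PER_SECOND / SECONDS_PER_YEAR

print(tests)
print(f"{years:.1e} years at the assumed rate")
print(independent_tests(elements=100, state_bits_each=1, input_bits_each=1))
```

Under the independence assumption the count grows linearly with the number of elements rather than exponentially, which is what makes the less ambitious approaches practical.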

Development of a new testing program for each new IC or digital system
is increasingly distressing. This has caused many to ask for a system-oriented
testing approach designed into digital systems (see <Vaccaro 74> and <Kautz 68>).

Fault-handling techniques typically use systems composed of a large
number of components, some of which are defective. Two major fault-handling
techniques are repair and fault-tolerance.

Repair techniques are characterized by fault-detection, followed by 
some form of bypassing of faulty components, so the system’s output then 
depends only on the output of its good components. The best-known repair 
technique for digital systems involves detection of a faulty component in a system 
designed to include only good components; a human then substitutes a good 
component for the faulty component. We focus here on approaches which, like 
ours, allow defective components to be associated with a working system. 

Fault-tolerant hardware techniques aim at proper hardware performance 
in spite of faults which occur during the hardware’s operation. Standby 
redundancy “employs several functionally identical modules, some being used 
actively to perform the function, the remainder waiting to be switched in should 
one of the active modules fail.” (See <Carter 70>.) Masking redundancy “is 
applied to techniques which involve encoding function, active performance by all 
parts of the system, and implicit recognition of error.” (Ibid.) 

While our approach is fundamentally a repair approach, it is compatible 
with realization of fault-tolerant machines. Cells can act either as active 
components or as standby parts in a standby redundancy system. The mechanism 
for switching cellular standby parts into active status is built into the cells of an 
array. Machines using masking redundancy can be embedded in an array. Chapter 
4 presents a cellular array consistent with such a hybrid, fault-tolerant system. 
Two machines in such an array can monitor each other. If one machine fails, the 
other can test the failed machine’s subarray; and embed a new, perfect machine in 
that array. 

<Spandorfer 68> describes discretionary wiring as one repair technique 
designed to provide large, high-yield ICs: 

"The number of basic components fabricated on a large semiconductor 
chip is larger than that needed for the desired final circuit. Each 
component has associated bond pads, which are probed during testing. 
The location of good components is used by a computer to generate 
patterns which wire only the good components into the final circuit. . . . 
Arrays make use of two metal and two insulation masks." 

The fact that this approach was even attempted, despite its considerable 
inherent difficulties, points up the desirability of high-yielding, high-integration ICs. 
<Foss 70> mentions some of discretionary wiring’s difficulties: 

"The layout of a discretionary wiring cell array is made very inefficient 
by redundancy and the need to allow probe testing of the individual 
cells. Although the 'yield problem' of the logic cells is eased, the 
approach still requires '100% yield' on the subsequent dielectric 
deposition and metallization processes. (These are normally the lowest-
yield steps in wafer fabrication using two metallization steps. - FM) As 
these must be faultless over a much greater area than needed for non-
discretionary cell interconnection, it may well be found that the yield 
problems of the two approaches (discretionary wiring and conventional) 
are not dissimilar." 

<Spandorfer 68> remarks that “the key problem is the critical dependence on data 
processing for mask layout, albeit off-line, and a host of other production control 
routines per copy. Further, use of a unique mask implies a relatively slow 
production process for a given capital investment." “In addition, each product has 
unique interconnection and dynamic characteristics.” (See <Marvin 67>.) These 
reasons indicate why discretionary wiring is now generally considered a bad way 
to achieve high integration. 

Other custom wiring techniques for using faulty components have been 
proposed. <Tammaru 67> describes use of customized board wiring to allow use 
of faulty ICs. 

<Sanders 72> describes a method that involved color-code 
categorization of partly-good memory components and their subsequent installation 
in a standard color-keyed circuit board. These boards used logic that transformed 
incoming addresses to addresses of good memory words. Texas Instruments 
expects to use a similar method in late 1975 for realization of perfect bubble 
memories from imperfect components. The major difficulties with these two 
techniques are the handling of multiple part types and the need for defect 
statistics that are fairly consistent as time passes. 

Unlike our approach, those above require perfect inter-component 
wiring. Furthermore, human and mechanical intervention are necessary to repair a 
system that develops a fault after the system is initially fabricated. 


2.5.B Cellular Arrays 
Cellular arrays have several potential testing and repair advantages. 
<Kautz 67> notes: 
"One would naturally expect that the iterative structure and the short 
internal connections of a cellular logic array would allow it to be tested 
from its edge terminals much more easily than a relatively disorganized 
interconnection of the same number of gates. If test signals are able to 
pierce the first one or two cells at the edge of the array, they can 
probably be arranged to pierce arbitrarily deeply. Similarly, if an error 
due to a fault can be made to pass through one cell toward the output 
terminals, there is a good chance it can pass the entire route. In 
addition, the regularity and shortness of connections in the array tend to 
support the convenient assumption that the array as a whole is fault- 
free if each individual cell can be shown to be fault-free.” 
One might also expect iterative arrays to be amenable to simple, iterative test 
procedures (see <Thurber 69>). 

<Kautz 67>, <Menon 71>, <Tammaru 69>, and <Seth 69> study cutpoint-
connected combinational arrays in which the output of each cell is a fixed function 
of only that cell’s inputs. These papers concentrate on array testability and 
diagnosability: the capability, method, and time to test for, and preferably locate, 
faults of an assumed nature, via inputs and outputs at the edge of an array. A 
common assumption is that all input combinations to a cell must be included in 
testing of that cell (Kautz, Seth), or at least that each gate in a cell must be tested 
(Menon). 

Our approach is quite different. We treat arrays of more complicated 
cells - checkerboard arrays of programmable logic. Since we don’t care about a 
cell’s response to state-input combinations it won't encounter during its operation, 
testing and repair are facilitated. An edge cell with floating inputs or outputs need 
not have these lines connected to a test machine. Instead of asking for necessary 
and sufficient conditions on array testability or diagnosability, we construct 
modules that convert a given array to one that is easily tested and repaired by 
programs we describe. 

<Yau 70> also treats 2-dimensional cutpoint-connected combinational 
logic arrays. The paper presents efficient methods for adding logic and terminals 
to each cell of an array to make the resultant, modified array diagnosable, and for 
deriving test schedules for it. The logic and terminals added depend on the 
original design; they are not standard. Test signals are routed to and from the 
edges of the array. Repair is accomplished by shorting or opening metal 
interconnections. 


<Spandorfer 65> describes repair of two-dimensional cellular arrays by 
use of computer-determined wiring patterns which preserve the two-dimensional 
topology of the array. Custom metallization, which may pass over flawed cells, is 
used to convert a flawed array into a smaller, perfect array by proper connection 
of perfect cells. This approach assumes the interconnection network of a perfect 
array must be preserved; it does not consider the class of a machine embedded in 
the flawed array. 

<Minnick 66> describes use of custom metallization for repair of various 
arrays. He also discusses the efficiency of associated repair strategies for 
cutpoint arrays. 

The closest precursor of our testing and repair approach seems to be 
<Kukreja 73>. Testing a cell in a particular state involves including each of its 
inputs and outputs in a signal path to an edge input or output. Test signals to and 
from a tested cell are carried by cells in transmission states. Repair comes from 
programming cells in a row or column containing a faulty cell to behave like wires 
linking good cells (see figure 2.2). 

Our test and repair of arrays embedding high-relcon machines similarly 
involves use of cells in transmission states. However, there are key differences 
between our arrays and Kukreja’s. Kukreja’s 2-dimensional array is a cutpoint-
connected array of simpler cells, in which each cell receives control variables on 
lines from a third dimension. Cells are not programmable logic cells, so there is no 
loader. Sequential machines are realized by a 3-dimensional stack of 2-
dimensional arrays. Testing requires direct connection between a test machine and 
all the edge cells of an array. Kukreja’s approach requires far more cells and 
extra-array connections for testing of arrays, and for realization of most machines. 
Kukreja’s repair approach is what we call "simple repair"; our repair procedure is 
more complicated, but also more efficient for most checkerboard arrays. Kukreja’s 
emphasis is quite different. He does not address many of the design, testing, and 
repair issues we address. 

Fig. 2.2 Programmed Array-Repair 

Flawed 3 x 3 array 
X indicates a flawed cell 
G indicates a good cell in an arbitrary state 

In sum, our approach is the first one we’ve seen that details LSI- 
oriented circuit modules and describes associated software for low-cost, 
automated, electrical testing, repair, and customization of cellular arrays. It is a 
systems approach whose advantages will become clearer as its description 
becomes more specific in the following chapters. 

CHAPTER 3: ARRAY-EMBEDDED ARMS 


Section 3.0: Introduction 

This chapter presents several examples of machines that are embedded 
as arms. Since any one of a large set of loading arms may be grown and retracted 
by loading signals input to one side-set of one arm base cell, flexible loading can 
occur in a flawed array. Processing-layer arm machines, which are composed of 
balanced cells, can be gradually grown and tested, and snaked around flaws in an 
array. We focus on the cell mechanisms and support programs that provide these 
capabilities. For specificity and practicality, we concentrate on the realization of 
highly integrated, length-programmable, computer-repairable shift-registers. 
However, our techniques apply to other arm machines. These techniques are 
easily generalized to tree and high-relcon machines. 

We present a mono-active, balanced loading mechanism for growth of 
loading arms. The loading inputs of any side-set S of a cell are sufficient to load 
that cell’s loader and function-specification state bits. After the cell is loaded, its 
loader state bits may specify which of the cell’s neighbors, if any, receives loading 
information funneled through the cell from side-set S. The loader’s balance allows 
an arm to funnel the same command to a cell independent of the path the arm 
takes in reaching the cell. Optional cell modules extend the loader’s capability. 
The loader state may specify that an arm’s tip be re-loaded, or that an arm 
incrementally retract. A brief signal to the base of an arm can cause the arm to 
totally retract. We use this same loading mechanism throughout our work. A 
mechanism with its capabilities is easily incorporated in arrays of two or more 
dimensions. 


The most interesting machines that can be embedded in the processing 
layer of this chapter’s arrays are arm machines. Embedding is particularly easy for 
these machines. Typically, the loading and processing arms grow together through 
the same cells. The Array Programmer adds one cell at a time to these arms; and 
tests the new, extended arms after each extension. The Array Programmer only 
communicates directly with the processing inputs and outputs of one side-set of an 
arm’s base cell. When the arms encounter a flawed cell, they may be partially or 
completely retracted, and grown through different cells in the array. An arm may 
be grown in any direction, avoiding flawed cells, because of the loading arm’s 
flexibility and the fact that an arm of balanced cells is grown in the processing 
layer. Our description of the testing and repair processes depends on a model of 
flawed cells’ behavior, which we state and analyze. We study repair efficiency 
through a program that simulates repair, and suggest techniques to improve 
efficiency. We consider other issues relevant to the practical implementation of 
our arrays. 

Repair through the interwoven processes of arm growth and testing 
contrasts with repair of high-relcon arrays. Since the requirements on the 
communication paths between essential cells of a high-relcon embedded machine 
are more stringent, repair efficiency is enhanced by the location of all the flaws in 
an array before a global determination of a good way to repair the array. Repair 
efficiency therefore dictates that a test procedure is limited in its ability to 
predict the role a given cell will play in an embedded machine. No matter how 
large an arm machine grows, its inputs and outputs are always at one side-set of 
its base cell. Interwoven processes of machine growth and testing are hampered 
when a machine’s growth implies a sometimes-growing number of inputs and 
outputs: the number of connections between a test machine and a partially grown 
embedded machine is variable, and may be large. These considerations encourage 
us to test all the cells in a high-relcon array before repairing that array. However, 
the test and repair processes for high-relcon machines use the same loader 
described in this chapter, and balanced processing transmission states similar to 
the balanced states in this chapter. Many practical implementation issues are 
similar. Thus this chapter is useful in itself, and as a bridge to the next chapter. 

Since a tree machine may be embedded as an arm, it’s not surprising 
that the approach described for arm machines is readily adapted to tree machines. 
Because a tree machine may be realized by an embedded machine with any tree-
like relcon network, including an arm network, an array embedding a tree machine 
is particularly easy to test and repair. Embedding a tree or arm machine is 
considerably easier than embedding a high-relcon machine. 

Section 3.1: The Loader 

In programmable logic, a loading mechanism is used to load the function-
specification state bits in a cell. Others have proposed loading mechanisms 
incorporating long, fixed, irredundant signal paths routing loading information to a 
given cell. These loading mechanisms have major limitations, including susceptibility 
to catastrophic failure due to destruction of a long, critical loading line. We 
propose a method which incorporates extra logic elements in each cell to allow the 
flexible growth of a loading arm in an array. A loading arm is composed of cells in 
proper loading states, and not long lines. Since an arm may be grown from any 
cell, an entire array can be loaded via inputs to one cell in the array. 

Figure 3.1 illustrates two common methods for loading the function-
specification state bits contained in a shift-register in each cell. Figure 3.6 
describes a shift-register’s operation. Snake and Crisscross use parallel-out shift-
registers, with each output acting as a function-specification state bit connected to 
the processing layer. These connections are not shown in figure 3.1. 
In Snake, each shift-register is part of a long shift-register that snakes through all 
the cells of the array (see <Spandorfer 65>). (If each function register is 
associated with one cell, cells in different rows have different loading inputs and 
outputs; the array is strictly cellular only if we conceptually group mini “cells” into 
a macro cell.) In Crisscross, one of Wahlstrom’s methods, each cell is associated 
with a unique clockline, dataline pair (see <Shoup 70> and <Wahlstrom 69>). Each 
clockline extends through a column of cells, and each dataline extends through a 
row of cells. Co-column cells must therefore be loaded simultaneously. 

Fig. 3.1 Two Common Programmable Logic Loading Mechanisms 
(The function registers are shown without their outputs to the processing layer.) 
a) Snake 
b) Crisscross 

Note that these methods could operate on a more general type of array. 
For instance, function shift-registers in different cells could have different lengths, 
drive different circuitry, etc. That is, the key idea is that a loading mechanism is 
iterated through the array. The loading techniques we describe are also useful in 
this type of array, if it has two or more dimensions. 

Like the loading methods of figure 3.1, our loading method loads 
function-specification state bits into a shift-register. However, our loading method 
uses logic elements in each cell to allow loading information input to any cell to be 
routed to any other array cell that is not walled off by faulty cells. Loading inputs 
set a cell’s function-specification state bits and loader bits. The loader bits 
specify how subsequent loading information input to the cell is to be handled. 
They may, for instance, specify that it is to be routed to some neighbor-cell. 
Consequently, loading information to a cell may be routed to any cell in the array 
by a loading path, or arm, of cells in appropriate loader states (see figure 3.2). 
Loading signals input to the base of the arm can load the tip of the arm, extend 
the arm from its tip, or retract the arm. Proper use of a perfect array’s loading 
mechanism only allows the embedding of arms in the loader layer of the array. 
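
The claim that loading information can reach any cell not walled off by faulty cells can be illustrated with a small reachability sketch. It assumes the flawed cells are already known and uses a breadth-first search over a checkerboard grid; the function name `arm_route` and the (row, column) coordinates are hypothetical. It is not the Array Programmer's procedure, which grows and tests an arm one cell at a time.

```python
from collections import deque


def arm_route(rows, cols, flawed, base, target):
    """Shortest route of good cells from base to target, or None if walled off."""
    if base in flawed or target in flawed:
        return None
    previous = {base: None}          # cell -> cell it was reached from
    queue = deque([base])
    while queue:
        cell = queue.popleft()
        if cell == target:
            route = []
            while cell is not None:  # walk back to the base
                route.append(cell)
                cell = previous[cell]
            return route[::-1]
        r, c = cell
        # Up, right, down, and left neighbors, matching the four side-sets.
        for nxt in ((r - 1, c), (r, c + 1), (r + 1, c), (r, c - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in flawed and nxt not in previous:
                previous[nxt] = cell
                queue.append(nxt)
    return None


# A 3 x 3 array whose center cell is flawed: the arm goes around it.
print(arm_route(3, 3, {(1, 1)}, base=(0, 0), target=(2, 2)))
# A base cell walled off by flawed neighbors cannot reach the target.
print(arm_route(3, 3, {(0, 1), (1, 0)}, base=(0, 0), target=(2, 2)))
```

A route found this way never revisits a cell, which is consistent with the Basic Loader's restriction that a cell may not touch a cell already in a loading arm.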

Fig. 3.2 A Loading Arm Grown By Array Programmer Signals (the figure marks the arm's base) 

Our flexible loading arm has several advantages compared to the 
loaders of Figure 3.1: 

1) The methods of Figure 3.1 depend on long, inflexible signal paths. 
Each cell can be loaded in only one way, so the cell is useless if that 
way doesn’t work. A signal path connecting many cells is a weak link in 
terms of repairability; its destruction severely limits the usefulness of 
the array. Furthermore load impedance, noise, and delay considerations 
make long lines undesirable. Our loading method does not require any 
long signal lines. In some technologies, such as magnetic bubbles, all 
long lines, including power supply lines, can be eliminated from the 
array. 

2) The other methods require that many cells be loaded 
simultaneously, even if one only wants to load one. For Snake, this 
requires the reloading of an entire array even when one wants to 
change the state of just one shift-register in the array. For Crisscross, 
this requires the reloading of an entire column. In our approach, loading 
cell A from cell B requires a loading arm from A to B. If no arm already 
exists, only the cells on some path between A and B need reloading. 
Once a loading arm is formed, its tip can be easily moved around. This 
is particularly attractive because two successively loaded cells are 
usually close to each other. 

3) Crisscross requires a loader input to the array for each row and 
each column of the array. Large arrays consequently require a large 
number of input pins and associated connections. Recognizing this 
deficiency, <Wahlstrom 69> describes an extension of Crisscross. In 
this extension, a cell can enter a state in which a processing input is 
transmitted to a dataline above it or a clockline to its right. This allows 
loading of an arbitrary cell in an array via processing and loader inputs 
to a cell in the lower-left corner of the array. Wahlstrom admits that 
such loading is indirect and slow. Its utility is severely restricted if the 
lower-left corner of the array is faulty. Our method allows speedy 
loading of an arbitrary cell with the three loader inputs of a side-set of 
any one cell in the array. This implies that connecting the loading 
outputs of some cell in one array to the loading inputs of some cell in 
another array allows a loading arm to be extended from the first array 
into the second array. This minimizes the number of pins required for 
testing, loading, and repair in systems composed of several ICs. 

4) For all the loading methods, one can envision a function state in 
which a cell’s processing inputs control loading lines near the cell. (A 
machine in Chapter 4 uses such a state to allow a machine embedded in 
an array to test its environment, and to construct and repair machines in 
that environment.) For the other methods, there are harsh limits on the 
position and number of cells that can be loaded from such a cell, even in 
a perfect array. With our method, any cell can be loaded from any 
other cell that is not walled off by flawed cells. 

5) A loading arm’s flexibility allows it to avoid flaws in a faulty array. 

6) Our method allows use of several loading arms simultaneously 
loading unrelated cells. Because the other methods involve loading lines 
extending through many cells, they do not allow this. 
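
Advantage 2 can be made concrete with a rough count of the cells that must be reloaded to change one cell in a perfect array. The sketch assumes that Snake reloads every cell, that Crisscross reloads one full column, and that a fresh loading arm follows a shortest path from its base to the target, counting both end cells. The function name and coordinates are hypothetical.

```python
def cells_reloaded(rows, cols, base, target):
    """Cells reloaded to change one cell, under the assumptions stated above."""
    snake = rows * cols      # the whole array is one long shift-register
    crisscross = rows        # every cell sharing the target's clockline (one column)
    # Shortest arm between base and target, counting both end cells.
    arm = abs(base[0] - target[0]) + abs(base[1] - target[1]) + 1
    return {"snake": snake, "crisscross": crisscross, "arm": arm}


print(cells_reloaded(8, 8, base=(0, 0), target=(2, 3)))
```

If an arm already reaches a cell near the target, the arm figure is smaller still, since only the cells added to the arm need loading.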


Our loader demands a small number of additional logic elements in each 
cell to achieve its advantages, but the cost of logic elements is declining rapidly 
compared to other system costs. 

A cell’s loading mechanism allows the loading of function-specification 
state bits in the cell. This mechanism consists of a Basic Loader, which is usually 
combined with one or more loader options. The site of loading activity in an array 
is the tip of a loading arm. The Basic Loader allows the extension of an arm to 
include any of the tip’s neighbors which aren’t already in a loading arm. A Total 
Retractor option allows the rapid destruction of an arm by a single signal to the 
base of the arm. An Incremental Retractor option causes a tip’s relcon neighbor in 
a loading arm to be the new tip; the loading arm is then incrementally retracted. 
The Tip Changer option allows a tip cell to be repetitively loaded. The loader 
options demand extra logic elements, but extend the power of the loading arm. 

Each cell in a checkerboard array has Select, Clock, and Data loader 
inputs and outputs at each of the cell’s four side-sets. When an array’s power is 
turned on, its working cells are initialized so that all their Select output lines are 
low. Raising a side-set SSi’s Select input activates SSi: Data and Clock inputs at 
SSi may load the cell’s register containing its function-specification and loader 
state bits. The newly Selected cell is the tip of some loading arm. A counter in 
this tip cell counts the number of bits shifted into the cell’s register after 
activation, so the cell knows when its register has been loaded. The loader state 
then specifies the new arm tip. With the Basic Loader, one of the loaded cell’s 
interconnection neighbors that isn’t in an arm may have its Select input at side-set 
SSj raised. Then Clock and Data information from the base of the loading arm flow 
through the arm, through the former tip, to the new tip. This process may iterate. 
Loader options extend an Array Programmer’s ability to control a loading arm. 

Figures 3.3 through 3.5 give an embodiment of our loading mechanism in 
a checkerboard cellular array. Most discussion of the loading mechanism is on a 
functional level. The loader can therefore be understood without reference to 
these diagrams, but they are included for specificity. Figure 3.3 shows the 
mnemonically-initialized names of the loading inputs and outputs of a cell. The 
loader lines are Select, Clock, and Data. Up, Right, Down, and Left refer to the 
cell’s four side-sets. Figure 3.4 shows one possible realization for the Pulser in 
the lower-left corner of figure 3.5. After power is supplied to an array, the 
pulser’s OUT line in each working cell remains low long enough to assure that all 
appropriate memory elements are simultaneously reset. Intel’s microprocessors 
have a circuit with this effect. The other elements in figure 3.5 - a complete 
functional diagram of the loader for one cell - are familiar standard logic elements 
like those in a Texas Instruments TTL catalog. The function of each logic element 
is summarized in the symbol table in figure 3.6. These elements could be realized 
in many forms and technologies. 
Fig. 3.3 Input-output Lines Of A Cell's Loading Mechanism 

Fig. 3.4 The Loading Mechanism's Pulser (panels: Behavior; Possible circuit) 

Fig. 3.5 Loading Mechanism With Options 

Fig. 3.6 Symbol Table 

A) Combinational logic elements 

AND GATE: OUT = A AND B AND C. 
OUT is a logical high, or 1, if and only if 
A and B and C are all 1. 

OR GATE: OUT = A OR B OR C. 

EQUAL GATE: OUT = (A = B). 

CIRCLE: OUT = ~A. OUT is the 
complement of A. 

INVERTER: OUT = ~A. 

Although combinational logic elements ideally act instantaneously, actual 
devices involve a slight delay before a change in the inputs is reflected at the 
outputs. This delay is used in the pulse-maker. 

B) Pulse-maker 

PULSE-MAKER behavior. We always 
use the Pulse-maker for inputs that 

C) Memory elements 

D always indicates a DATA input. 
C always indicates a CLOCK input for a memory element. The memory 

SHIFT-REGISTER (m-bit, serial-in, parallel-out). Outputs are Q0 through 
Q(m-1). 

This is equivalent to the shift-register 
above. 

SHIFT-REGISTER (m-bit, serial-in, serial-
out). Only Q(m-1), the last bit of the 
shift-register, is output. 

COUNTER (m-bit). This counts in binary, 
changing state on positive-going 
transitions of C. If m = 2, the counter 
has the state-transitions for (Q0 Q1) of: 
(0 0) to (1 0) to (0 1) to (1 1) to (0 0). 
We first detail the Basic Loader, with none of the loader options. S-R 
FL is the shift-register containing the function-specification and loader state bits. 
LO0 and LO1 are the loader state bits that specify the loader’s output side-set. 
LSTA is only included when the Tip Changer is used; this loader state bit specifies 
whether a tip cell is to be re-loaded. S-R FL may have any number of function-
specification state bits. Here we assume four such bits - IN0, IN1, OUT0, and 
OUT1. CTR, a counter, counts the number of bits shifted into FL after a cell is 
activated. Since CTR must be able to count to P, the number of bits in FL, CTR is 
log2 P bits, rounded up. The P-detector outputs a 1 when CTR’s count is P. For P = 6, the 
P-detector performs the function OUT = B2 AND B1 AND NOT B0, which detects 6 
in binary (B2 B1 B0 = 1 1 0). 
TCH, the “touch” flip-flop, signals that a cell has been loaded. 

The Basic Loader is used to load a perfect array in the following way. 
When power is supplied to the array, the Pulser resets CTR and TCH to 0. The 
CTR, the P-detector, and the TCH flip-flop are used to determine when a cell’s 
shift-register FL has been loaded. S-R FL is in an indeterminate state (although 
some processing layers require it to be pulser-resettable; this forces all cells into 
the same function state when power is turned on). All extra-array inputs to the 
array are 0. Assume that S.L.IN for some cell A goes from 0 to 1. Cell A has been 
TOUCHED from the left, and now its left loading inputs are ACTIVATED. C.L.IN and 
D.L.IN may now affect the cell’s function-specification and loader state bits and its 
loader outputs. This prepares shift-register FL of cell A to be loaded via D.L.IN 
and C.L.IN. Since all other S.INs are 0, the loader inputs of all other side-sets are 
not activated. D.L.IN is relayed to D.OUT and C.L.IN is relayed to C.OUT. Besides being 
the D outputs of the cell, D.OUT is the D input to shift-register FL. Since TCH=0, 
C.OUT is the C input to shift-register FL and CTR. The first positive transition of 
C.L.IN causes 

1) D.L.IN to be shifted into shift-register FL, and 

2) CTR to be incremented to (B0 B1 B2)=(1 0 0). 

During loading of the cell, CTR functions to count the number of positive C.IN 
transitions since the cell was touched. That is, it counts the number of bits shifted 
into shift-register FL. Succeeding C.L.IN positive-transitions will similarly shift 
information into shift-register FL and increment CTR. The Pth such transition 
(6th in this example) causes 

1) the 6th D.L.IN bit to be shifted into shift-register FL, so that all the 
information in shift-register FL has been loaded from D.L.IN since S.L.IN 
went high; and 

2) CTR=(0 1 1); this causes the output of the P-detector to go high. 

Thus shift-register FL has been loaded with function-specification and loader state 
bits; the P-detector signals this fact by sending a high signal to the input to the D 
flip-flop. When C.L.IN next goes from high to low, TCH goes high. This causes 

1) the C inputs of shift-register FL and CTR to remain low; C.OUT is no 
longer transferred to them; and 

2) one and only one S.OUT to go high, thereby activating inputs at the 
side-set of some “touched” neighbor cell. The one selected is 
determined by LO0 and LO1. 

The loading arm is a loading-signal path starting with some base cell 
with a high S.IN, and possibly extending from that cell to other cells, with the arm’s 
path marked by high S lines linking neighboring cells. Figure 3.2 showed one such 
loading arm. With the Basic Loader we restrict a cell from touching a cell that is in 
a loading arm. In this case, this means cell A should not touch left, the source of 
loading information. It may touch cells that are up, right, or down neighbors. 
Assume cell A touches cell B above cell A. We then say that the loading arm’s TIP 
has been moved up from A to B. This is caused by loading cell A with LO0=LO1=0; 
when cell A’s TCH goes high, its S.U.OUT goes high. That is, cell B is then touched 
by cell A. Because S.L.IN is still cell A’s only high S.IN, it’s still true for cell A that 
C.OUT=C.L.IN, and D.OUT=D.L.IN. Because cell B is the only cell that A is touching, 
cell B is the only neighbor of cell A to accept C and D information from A. B can 
now be loaded in the same way that A was. B may then touch some new neighbor, 
funnel loading information to this new tip, etc. That is, this process of a cell’s 
being loaded, touching a neighbor, and funneling loading information to that neighbor 
may iterate. In this way a loading arm may be snaked through an array, with its 
length only limited by the size of the perfect array. This growth of a loading arm 
to any cell from any other is facilitated by the loading mechanism’s mono-active, 
balanced nature. 

A brief example will illustrate loading via growth of a flexible loading
arm. Assume a perfect array is to be loaded with function states (IN0 IN1 OUT0
OUT1) equal to (0 0 1 0) for cell (0 0), (1 1 1 0) for cell (0 1), (0 1 0 1) for
cell (1 0), and (1 0 1 0) for cell (1 1). The Array Programmer may connect to cell (0 0)
in the manner shown in figure 3.8. After the array is turned on, all cells have
CTR=TCH=0. The Array Programmer raises S.L.IN of cell (0 0), the base of the
loading arm. The Array Programmer uses the C and D lines to clock out the
sequence {0,0,0,1,0,0} in the manner indicated in figure 3.7. This loads shift-
register FL with (IN0 IN1 OUT0 OUT1 LO0 LO1)=(0 0 1 0 0 0). That is, the
function-specification state bits have been properly loaded and the loader bits LO0
and LO1 tell cell (0 0) to touch up. At T0, the first downward transition of C.L.IN
after the loading of shift-register FL, S.U.OUT of cell (0 0) is raised. Cell (0 1) is
now ready to receive C and D information from the Array Programmer, routed
through cell (0 0).

The subsequent sequence clocked out of the Array Programmer via the 
C and D lines is {0,1,0,1,1,1}, {1,0,0,1,0,1}, {1,0,1,0}. Thus all the cells of the array 
have been loaded by a loading arm snaking through the array in the manner 
indicated in Figure 3.8. A different-shaped loading arm could have been used to 
accomplish an equivalent loading of function-specification state bits. 
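
The bit ordering in this example can be checked with a short simulation. This is only a sketch: it assumes that each clocked D bit enters shift-register FL at the IN0 end and pushes earlier bits toward LO1, so that the first bit clocked ends in the last position. The function name `load_fl` is hypothetical, and the order in which the four cells are loaded is inferred by matching the resulting function states against the ones listed above.

```python
def load_fl(bits, length=6):
    """Shift `bits` into a register of `length` zeros, newest bit at index 0."""
    fl = [0] * length          # registers are reset to 0 at power-on
    for b in bits:
        fl = [b] + fl[:-1]     # one clock: new bit in, oldest bit falls off
    return fl


NAMES = ("IN0", "IN1", "OUT0", "OUT1", "LO0", "LO1")

# Sequences from the example, keyed by the cell they appear to load.
sequences = {
    "(0 0)": [0, 0, 0, 1, 0, 0],
    "(0 1)": [0, 1, 0, 1, 1, 1],
    "(1 1)": [1, 0, 0, 1, 0, 1],
    "(1 0)": [1, 0, 1, 0],     # last cell: only the four function bits are sent
}

for cell, seq in sequences.items():
    print(cell, dict(zip(NAMES, load_fl(seq))))
```

Under these assumptions the first four positions of each register reproduce the function states given for the four cells, and the last cell needs no loader bits because it touches no further neighbor.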

We now consider the various options available to enhance the capability
of the Basic Loader. The Total Retractor allows a loading arm to be grown and
later totally retracted by a signal to the base of the loading arm. This allows, for
instance, reloading of cells and rerouting of a loading arm to new cells from the
same arm base. With the Total Retractor, a perfect array is loaded just as
described above. Assume the loading arm of figure 3.8 exists. If the Array
Programmer lowers its S line, no S.IN of cell (0 0) is high; the Total Retractor of
cell (0 0) causes that cell to be reset to CTR=TCH=0. When TCH goes low, all
S.OUTs of cell (0 0) go low. This resets (1 0), which resets (1 1), which resets
(0 1). The function-specification state bits of these cells are unaffected. The Total
Retractor thus allows the resetting of all the cells in a loading arm by lowering the
S input to the base of the arm. These reset cells are then ready to be re-loaded
by some new loader arm.

Fig. 3.7 Clocking Out The Loading Sequence 0, 0, 0, 1, 0, 0

Fig. 3.8 A Loading Arm Formed By Touching Cells

The Incremental Retractor allows a loading arm to be shortened cell-by-
cell, instead of all-at-once as with the Total Retractor. The Incremental Retractor
shown in figure 3.5 includes the Total Retractor circuit, so this Incremental
Retractor is always used with the Total Retractor. The Incremental Retractor can
save time when, for instance, one wants to change the state of a cell that is near
the tip of a long loading arm. Consider the long loading arm of figure 3.9. If the
Array Programmer wanted to reload cell (99 0), as it might on the basis of some
test on cell (100 0), cell (100 0) could be loaded with information telling it to
touch left. When S.L.IN of cell (99 0) went high, the Incremental Retractor of cell
(99 0) would create a reset pulse. This would reset cell (99 0) for subsequent
loading from (98 0). Resetting of (99 0) would lower S.R.IN of (100 0), thereby
causing (100 0)’s Total Retractor to remove (100 0) from the loading arm while
leaving (100 0)’s function state the same. This incremental retraction is much
faster than the equivalent action of total retraction and subsequent growth of the
loader arm from (0 0) to (99 0).

In loader realizations in which the Incremental Retractor does not include
Total Retractor circuitry, the Incremental Retractor may be used for a type of total
retraction. If all the other S inputs of a cell are low, lowering its S line and then
raising it prepares the cell to be loaded from that S line’s side-set. Assume an
Array Programmer wanted to grow a new loading arm from the base of an existing,
unnecessary arm. By lowering the cell’s high S line, then raising it, the Array
Programmer would prepare the cell to be loaded, forcing all the cell’s S.OUT lines
low. The cell could re-touch the cell it last touched, or touch some other neighbor,
and a new arm could be grown from the old base. Of course, part of the old arm
might remain in the array. Under certain conditions this is intolerable; loading a
cell in an old arm from some new side-set involves special considerations, as we’ll
see. Nevertheless, many situations make fast retraction feasible through
exclusive use of the Incremental Retractor. When fast retraction is unfeasible, an
arm may be totally retracted cell-by-cell via the Incremental Retractor, as we’ve
discussed. If even this is impossible, due to a growth failure at the loading arm’s
tip, the Incremental Retractor allows a new arm to grow through the working cells
of an old arm; subsequent incremental retraction frees these working cells from
the loading arm.

The Tip Changer allows a tip cell to be repetitively loaded by the same
loading arm. This is another time-saving device, particularly helpful when one
wants to test the same cell in various states. It involves adding an extra bit to
shift-register FL, and consequently requires one more clock pulse for the loading
of a cell. If a tip cell is loaded with LSTA=1, the downward C.IN transition at T0
(directly after the P-detector goes high) causes resetting of CTR and TCH to
CTR=TCH=0. The fact that LSTA is high also prevents the cell from touching any
other cell. Thus the cell is loaded with a function state, remains the loading tip,
and is therefore ready to be immediately reloaded. If the cell is loaded with
LSTA=0, it may touch another cell as if no Tip Changer existed.

Thus the Basic Loader can be combined with any combination of Total
Retractor, Incremental Retractor, and Tip Changer. The particular combination used
in an array depends on the specific objectives for that array. In arrays designed
for infrequent loading, minimization of circuitry by exclusive use of one Retractor
option might be appropriate. The rest of this chapter details growth of shift-
register arms. If program-variable shift-register length was important to an array,
variation-speed considerations might encourage use of all loader options.

In summary, the fundamental loading mechanism allows loading inputs
from one of several sides to control loading of a cell. The cell may learn that, and
how, subsequent information input from its active loading side-set should be
passed to loading outputs of some other side-set. If the set of loading mechanism
neighbors is properly chosen, inputs to any cell may cause the loading of a cell
anywhere in a perfect array, and loading of most cells in a flawed array. The
loading mechanism may be incorporated into arrays with diverse processing layers.

Section 3.2: A Perfect Array Of Shift-register Cells

We now examine one realization of a complete shift-register cell, shown 
in Figure 3.10. We'll eventually show how an array of such cells can provide large, 
highly integrated, variable-length, automatically testable and repairable memories. 
For clarity, we begin by considering an array of such cells which contains no 
flawed cells. The shift-register cell’s loading mechanism is almost identical to the 
loader of Figure 3.5. For simplicity, we assume that all the loader’s options are 
included in the shift-register cell. In fact, the approach we describe can be 
adjusted to work with just a retractor option. 

Each side-set has Select, Clock, and Data loader inputs and outputs,
whose function has been described. In addition, each side-set has distinct Klock,
iNput, and Return processing input-output lines; there is one set of K.IN, K.OUT,
N.IN, N.OUT, R.IN, and R.OUT lines in each side-set. The shift-register cell could
have been realized by disjoint loading and processing mechanisms. However, the
cell shown in Figure 3.10 reduces circuitry and loading time by using bits in shift-
register FL in a dual role as function-specification and loader bits. The loader of
Figure 3.10 corresponds to that of figure 3.5 with the following mapping:

Figure 3.5:   IN0  IN1  OUT0  OUT1  LO0   LO1   LSTA
Figure 3.10:  IN0  IN1  OUT0  OUT1  OUT0  OUT1  STA

S-R FL is reset when power is turned on to limit processing layer complications of
certain faulty cells.


Figures 3.11 and 3.12 give alternate functional descriptions for the
processing outputs in various function states. It’s apparent that the processing
outputs depend only on the function-specification state bits, the processing state
(shift-register A and shift-register B), and the processing inputs. Shift-registers A
and B are of arbitrary length, with the particular practical length chosen by
integration-level considerations discussed later. A cell has one relcon neighbor
when STA=1; a Klock input from a side-set E clocks N.E.IN information through
shift-register A, then through shift-register B, and finally out N.E.OUT. A cell has
two relcon neighbors when STA=0; while K.E.IN clocks shift-register A and shift-
register B, N.E.IN flows through shift-register A and then out some side-set F, and
N.F.IN flows through shift-register B and then out N.E.OUT.

Fig. 3.10 A Complete Shift-register Cell

Fig. 3.11 Abbreviating A Shift-register Cell's Function State
(Diagram labels: Function state; Abbreviation; K.U.IN, N.U.IN, R.U.OUT.)
The FUNCTION STATE diagram indicates the important processing inputs and
outputs for a particular function state. The short arrow in the ABBREVIATION
diagram indicates the active Klock input. Its side-set is nearest the base of a
shift-register arm.

Fig. 3.12 Shift-register Cell's Function States

Cells are used to form shift-register arms. Information in an arm flows 
from the base of the arm via K and N lines to the arm’s tip, turns, and flows back 
to the base via the R lines. The cell at the tip of the arm has STA=1. This cell
acts as a loop; it forms shift-register A and shift-register B into one shift- 
register, with the same relcon neighbor providing K and N inputs to this shift- 
register, and receiving the Return output of this shift-register. All non-tip cells in 
the arm have STA=0. Each of these cells receives K.IN and N.IN information from a 
relcon neighbor nearer the arm base, and transmits R.OUT information to that cell.
Each of these cells also outputs N.OUT and K.OUT to a relcon neighbor nearer the 
arm tip, and receives R.IN information from that cell. 
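
The data path just described can be sketched as a small simulation. The code below is an illustration under stated assumptions, not the thesis's circuit: every cell's registers A and B are at least one bit long and are clocked simultaneously, registers start at 0, and the returned bit is the one sitting at the base's B output just before the clock. The names `Cell` and `clock_arm` are hypothetical.

```python
from collections import deque


class Cell:
    """One shift-register cell holding processing registers A and B."""

    def __init__(self, alength, blength):
        self.a = deque([0] * alength)
        self.b = deque([0] * blength)


def clock_arm(cells, n_in):
    """Apply one Klock pulse to the whole arm; return the base's Return bit."""
    # Sample every register output first, since all bits shift simultaneously.
    a_out = [c.a[-1] for c in cells]
    b_out = [c.b[-1] for c in cells]
    tip = len(cells) - 1
    for i, c in enumerate(cells):
        a_in = n_in if i == 0 else a_out[i - 1]        # N line from the base side
        b_in = a_out[i] if i == tip else b_out[i + 1]  # tip (STA=1) loops A into B
        c.a.pop()
        c.a.appendleft(a_in)
        c.b.pop()
        c.b.appendleft(b_in)
    return b_out[0]


m, alength, blength = 4, 3, 2
arm = [Cell(alength, blength) for _ in range(m)]
total = m * (alength + blength)      # bits in the whole out-and-back loop
pattern = [1, 0, 1, 1, 0, 0, 1]
returned = [clock_arm(arm, bit) for bit in pattern + [0] * total]
print(returned[total:] == pattern)   # True
```

In this model an arm of m cells behaves as one shift-register of m x (alength + blength) bits whose input and output both sit at the base.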

A simple example illustrates how the Basic Loader and Total Retractor
allow the loading of more than one shift-register into a perfect array by use of the
loader inputs to one cell. Consider the perfect array of Figure 3.13.A, with all
cells reset because power has just been turned on. The array’s only connections
with the outside world, other than power lines, are

1) (0 0)’s loader inputs, which connect to the Array Programmer; and

2) (0 0)’s and (0 1)’s N.L.IN, K.L.IN, and R.L.OUT lines (not shown in the
figure), which will provide the inputs and outputs of two shift-registers
embedded in the array.

After the Array Programmer raises S, the sequence {0,0,1,1}, {0,1,1,1},
{0,1,1,1}, {1,1,1,1} is clocked via C and D inputs into cell (0 0). When S is
lowered, the loading arm totally retracts. This leaves the array in the processing
state shown in Figure 3.13.B. S is again raised, and the sequence {0,1,1,1},
{0,1,1,1}, {1,1,1,1} is clocked out. S is lowered, and the array is left in the
processing state of Figure 3.13.C. The processing lines shown are made available
through some type of wire. Loader lines may also be made available, so that the
array may be repaired if it develops a flaw, or an embedded shift-register’s length
may be varied.

Fig. 3.13 Loading Two Shift-registers Into Perfect Array
(Extra-array processing lines are not shown.)

In estimating the time to load a cell’s p-bit shift-register FL, we
consider two extremes:

1) If the cell being loaded is the base of the arm, the minimum delay
between C.IN transitions is tmin = 1/fmax, where fmax is the maximum
clock-frequency of a shift-register.

2) If the cell being loaded is many cells away from the arm’s base,
tmin is determined by 2 factors:
   a) After a new D.IN bit has been sent to the loaded cell,
   C.IN cannot go high until we’re sure the D.IN bit will arrive at the
   loaded cell before C.IN’s new transition.
   b) After this C.IN transition, D.IN cannot be changed until
   we’re sure the C.IN transition will definitely arrive at the loaded
   cell before the new D.IN.

Thus the time to load a cell n cells from the base of its loading arm is the greater
of 2 numbers.

tload = p x max (1/fmax, n x (dmax + cmax - dmin - cmin))

Here dmax is the maximum delay of a D signal through a cell, and the other d and c
symbols above are similarly defined. Recall that a logic gate can have a very small
delay if it’s known that only one of its inputs changes frequently. Noting that the
C and D delays come solely from an AND-OR function, where the ANDs have only
one input that changes fast, we observe that dmax for a cell is approximately
equal to dmax for a logic gate with a load of four input-loads.
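
The load-time estimate above can be evaluated numerically. The sketch below assumes the leading factor in the expression is p, the number of bits in shift-register FL (the printed symbol is hard to read), and uses purely hypothetical delay values; `load_time` is not a name from the thesis.

```python
def load_time(p, n, fmax, dmax, cmax, dmin, cmin):
    """Estimated time to load a p-bit FL in a cell n cells from the arm's base."""
    clock_limit = 1.0 / fmax                        # limit set by the register itself
    skew_limit = n * (dmax + cmax - dmin - cmin)    # accumulated C/D delay uncertainty
    return p * max(clock_limit, skew_limit)


# Hypothetical values: 10 MHz registers, per-cell delays between 8 ns and 12 ns.
params = dict(p=6, fmax=10e6, dmax=12e-9, cmax=12e-9, dmin=8e-9, cmin=8e-9)
for n in (1, 5, 50):
    print(n, load_time(n=n, **params))
```

With these numbers the register's own clock rate dominates for short arms, while the accumulated delay uncertainty dominates once the arm is more than about a dozen cells long.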

In estimating the maximum frequency at which an embedded shift-
register may be clocked, we make two assumptions:

1) All bits of shift-registers A and B of a particular cell are clocked
simultaneously.

2) A clockpulse remains a pulse as it travels down an arm.

The rate-limiting delay then comes from the delay path schematized in Figure 3.14.

tmin = andmax + 2 x andormax + s-rmax + s-rsetup

Andmax is the maximum delay through an AND gate, where only one input to the
gate changes often. Andormax is the maximum delay through an AND-OR gate,
where only one input to an AND gate changes often. S-rmax is the maximum time
between a clock transition to a shift-register and the subsequent stabilization of
its output at its proper value. S-rsetup is the time the D input to a shift-register
must be stable before a clock transition. The Klock input to a shift-register arm
must have a low enough frequency that, for the longest possible arm, a pulse
remains a pulse as it travels down the arm; and two pulses are never less than
tmin apart.

The method we’ve shown for relaying clock signals down loader and 
shift-register arms has two major disadvantages: 

1) A clockpulse may expand or contract indefinitely if it’s passed 
down a long enough arm. This limits the clock’s frequency. 

2) The frequency at which Data can be sent down the loading arm is 
limited by the uncertain delay involved in sending a Clock or Data signal 
down a long arm. Ideally a Clockpulse and its associated Data bit flow 
through an arm at the same speed. 


Figure 3.15 shows a simple circuit which eliminates these difficulties. The circuit
assures that a clockpulse of minimum width P which is input to a loading arm will
travel down the arm with a tightly-bounded width. The Data Transmitter option
assures that a Data bit and its associated clockpulse flow together at the same
speed down a loading arm. Placement of the Pulsewidth Regulator before the
broadcast Klock output of a shift-register cell would increase the maximum
clocking frequency of a long shift-register arm. The loader’s use of the Pulsewidth
Regulator and Data Transmitter option would speed loading for long loading arms at
the cost of slower loading for short arms and increased cell overhead.

Fig. 3.14 Shift-register's Rate-limiting Delay

Fig. 3.15 Pulsewidth Regulator With Data Transmitter Option

The combination of the n-element delay with gate 2 constitutes a pulser
responsible for outputting a pulse long enough to trigger a neighboring cell’s flip-
flops. The m-element tapped delay in combination with gate 1 lengthens a clock
input pulse enough to assure that the pulser acts properly.

Assume that the delay of a signal-transition through any gate i is D plus or
minus t. Assume that the clock input to the Pulsewidth Regulator has been 0 long
enough so that the outputs of all gates are 0. First consider the Pulsewidth
Regulator with no option. Clock input receives a positive pulse of minimum length
P sufficient to trigger any of a cell’s attached memory elements. The m-element
tapped delay is tapped at enough places that a clockpulse P long causes one
longer pulse out gate 1. This longer pulse has a minimum width W such that

W > [P + (m + 1)(D - t) - (D + t)] = [P + mD - (m + 2)t].

If W ≥ n(D - t), the clock output is a pulse X with

X ≥ nD - (n + 2)t.

Assume that the pulse at the clock output is reduced by at most R as it travels
through gates to the clock input of a neighboring cell. Then the following
conditions assure that the neighboring cell receives a pulse of minimum width P.

nD - (n + 2)t ≥ P AND P + mD - (m + 2)t > n(D - t).

Longer clock input pulses obviously work fine.

A similar analysis calculates the maximum clock frequency.

If the Data Transmitter option is used, the clock output pulse must be
delayed the right amount to assure that a Data bit and its associated clockpulse
flow together from one cell to the next.
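
The Pulsewidth Regulator conditions of Figure 3.15 can be checked for a candidate design with a few lines of arithmetic. The sketch below assumes the width bounds W > P + mD - (m + 2)t and X ≥ nD - (n + 2)t from the figure's analysis. The printed conditions contain no explicit R term, so the `R` parameter here (defaulting to 0) is an added assumption, and all numeric values are hypothetical.

```python
def regulator_ok(P, D, t, m, n, R=0.0):
    """Return (w_min, x_min, ok) for a Pulsewidth Regulator design.

    P: minimum pulse width that triggers a cell's memory elements
    D, t: nominal gate delay and its tolerance (delay is D plus or minus t)
    m, n: number of elements in the tapped delay and in the pulser delay
    R: assumed worst-case shrinkage on the way to the neighbor (not in the
       printed conditions; 0 reproduces them exactly)
    """
    w_min = P + m * D - (m + 2) * t       # bound on the stretched pulse out of gate 1
    x_min = n * D - (n + 2) * t           # bound on the clock output pulse
    pulser_triggered = w_min > n * (D - t)
    neighbor_triggered = x_min - R >= P
    return w_min, x_min, pulser_triggered and neighbor_triggered


print(regulator_ok(P=10, D=5, t=0.5, m=4, n=4))   # (27.0, 17.0, True)
print(regulator_ok(P=10, D=5, t=0.5, m=4, n=2))   # (27.0, 8.0, False)
```

In the second case the pulser delay is too short, so the output pulse could be narrower than P and the neighboring cell might not be triggered.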

Richard Shoup’s method for forming an embedded shift-register is quite 
different from ours. In Shoup’s method, a cell contains only 1 processing layer 
shift-register; function state bits control which iNput goes to the shift-register, 
and the output of the shift-register is broadcast to all its neighbors. Clocks to the 
shift-register cells come down the Data control lines, the same lines used for 
loading function-specification state bits. This of course means that all co-column 
cells must be clocked simultaneously; they can’t, for instance, be used for 
different registers with different clock frequencies. Shoup’s arrays are relatively 
hard to test, especially for large arrays. Testing requires “building up shift- 
register paths of increasing length between opposite edges of the array.” (See 
<Shoup 73>.) Every cell is tested in all 4 directions; we'll see that this is an 
unnecessarily large amount of testing. The tester accesses the processing inputs
and outputs of all edge cells; this requires excessive use of probes and bonding
pads. Our loading method gives our shift-register arrays many advantages. We’ll
see that the fact that all communication with an embedded shift-register arm is
through its base also facilitates testing. The major drawback of this bi-directional
capability of our shift-register cell is that it slightly reduces a shift-register’s
maximum clock frequency.

Section 3.3: Testing And Repair

In this section we consider the concurrent processes of testing and
repair involved in embedding a shift-register arm machine in a flawed array. The
shift-register cell is the one we’ve been considering, that of figure 3.10. We focus
on growth of one arm from a base cell with loading and processing connections to
an Array Programmer, but the techniques discussed are easily generalized. The
Array Programmer uses a loading arm to grow longer and longer shift-register
arms, like the two in Figure 3.13. The growing shift-register arm extends through
the same cells as the loading arm. The arm is tested as it grows. Failure to pass
a test indicates that the arm must twist through the array in a slightly different
way, so that it includes only good cells. If an attempt is made to produce an arm
of a certain length in a given flawed array with inputs and outputs at a given side-
set, several things may happen. Such an arm may be realized, the array may be
found incapable of producing such an arm, or testing may take too long.

Embedding an arm of balanced cells is particularly easy. The arm is
extended cell-by-cell into an array. When an arm is in a given position, the arm is
tested under the temporary assumption that its non-tip cells will remain in their
current function states. The relcon neighbors of each body cell are therefore
known, and information flowing in the arm to and from its base tests the cell’s
communication with its relcon neighbors. As long as a cell’s interconnection, non-
relcon neighbors aren’t loaded, it’s assumed that their inputs to an arm cell don’t
change. Consequently, it’s sufficient to test an embedded arm via the inputs and
outputs at the base of the arm. The cells’ balance allows an arm to move in any
direction as it snakes through good cells in an array. Sometimes extension of an
arm in an intended direction is prevented by a flawed cell. Then the arm is
retracted, and the arm’s growth proceeds in some new direction from some stump
of the unsuccessfully extended arm. Since the cells in the stump of the arm stay
in the same function state, they need not be re-tested. A cell is only tested in its
role in an embedded shift-register arm. Thus growth of an arm through balanced
cells facilitates the interwoven processes of testing and repair.

Description of the testing process is much clearer if we use an example
simplified by some assumptions:

1) Good cells are only loaded under the Array Programmer’s control, and 
not by signals caused by faulty cells. This assumption is satisfied if no 
faulty cell outputs a high S at the same side-set where it outputs an 
alternating C, since this is the only way a faulty cell can load a good cell. 
This assumption is also satisfied if any cell that is not loaded under the 
Array Programmer’s control is defined as a bad cell, even if it is not the
cell’s fault that it is improperly loaded. Since we’d like properly formed 
cells to be used as good cells, we specify cell mechanisms that help 
guarantee that a good cell is not falsely loaded. This involves making the 
set of valid loading commands smaller than the set of possible loading 
commands, so that fault-generated commands are likely to be disobeyed. 


2) A cell’s performance depends only on that cell’s mechanism, state, and
input signals. It does not directly depend, for instance, on the state of some
other cell in the array. Like the fourth assumption, this saves testing time;
it’s used in most IC testing programs. The assumption is reasonable because
the only lines connecting different cells are the side-set lines and the power
lines. This assumption accounts for side-set lines. In some technologies,
such as magnetic bubbles, coupling could not occur through cell-connecting
power lines because there are no such lines. With other technologies, such
as conventional semiconductor technology, it’s true that such coupling could
occur. However, the regularity of an array is useful in minimizing this
possibility. Each cell could contain a simple regulator circuit optimized for
the highly predictable characteristics of a working cell.

3) Cells that are faulty during array testing must be somewhat consistent
in their faulty behavior; that is,

A) a once-tested cell doesn’t develop new faults during array
testing; and

B) a processing input that doesn’t alternate during the testing of a cell
may not alternate during other array testing, unless the Array
Programmer commands it to alternate.

Assumption 3 makes it easier to localize the cause of a test failure.
Assumption 3A is reasonable because the time to test a realizable array is
very short compared to its mean-time-between-failure. Assumption 3B
allows a cell in an embedded arm to be tested solely for proper
communication with its relcon neighbors; it allows the Array Programmer to
assume that a cell passing its tests won’t misbehave during further array 
testing due to a previously unencountered input signal. (Most cells wouldn’t 
misbehave anyway, since they’re programmed to ignore irrelevant inputs.) A 
cell may have side-sets which are inaccessible to an Array Programmer, due 
to the cell’s position near a flawed cell or at an array’s edge. All the cells 
of figure 3.16 have at least one inaccessible side-set. Assumption 3 allows 
such a cell to be embedded in an arm in spite of the inaccessibility of its 
irrelevant side-sets. Like assumption 1, assumption 3 is valid if all the 
inputs and outputs of faulty cells are assumed to be stuck at some value.
If assumption 3 is invalid for a particular array, the Array Programmer may 
become confused during testing. In this case the Array Programmer can 
start testing the array again. Repeated confusion indicates that faults are 
forming at a pathologically high rate; then the Array Programmer signals that 
the array should be rejected. 

4) The behavior of certain mechanisms in a cell is independent of the
state of other mechanisms. We assume that a shift-register bit works if it
successfully accepts new information when the bit and its 2 neighboring bits
are in any of their 2^3 = 8 states. This assumption implies that a shift-
register bit’s function is independent of the state of non-adjacent bits in an
array. This allows the testing of a length-n shift-register by testing its
ability to shift a (10 + n)-bit sequence (0 0 0 1 0 1 1 1 0 0 ...), in which
the n bits are used to push the first ten bits through the shift-register.
This common, reasonable assumption is necessary to save testing time;
testing a 40-bit shift-register in every state would take a sequence of
approximately 1,000,000,000,000 bits, and we expect a cell’s shift-register
to be considerably longer than 40 bits. The unspecified bits in the
sequence above could be selected to drain maximum current from the
power supply. Similarly, we assume that the processing mechanism’s
behavior is independent of the loader’s state. This assumption saves test
time.
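
The choice of ten leading test bits in assumption 4 can be verified directly. The short check below is a sketch that assumes the leading bits are 0 0 0 1 0 1 1 1 0 0; the helper name `windows` is hypothetical. It confirms that every one of the 8 possible states of a bit and its two neighbors appears as three consecutive bits of the sequence, and it evaluates the cost of exhaustively testing 40 bits.

```python
def windows(bits, k=3):
    """Return the set of distinct k-bit patterns seen in consecutive positions."""
    return {tuple(bits[i:i + k]) for i in range(len(bits) - k + 1)}


prefix = [0, 0, 0, 1, 0, 1, 1, 1, 0, 0]
seen = windows(prefix)
print(sorted(seen))
print(len(seen) == 2 ** 3)   # True: all 8 three-bit states occur
print(2 ** 40)               # 1099511627776 states for a 40-bit register
```

Ten bits is the shortest length that can contain eight distinct three-bit windows, since a sequence of length L has only L - 2 such windows.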


The validity of these assumptions, which are like those made in testing 
conventional digital systems, can be made very likely by proper array design and 
layout. The ultimate test of the validity of these assumptions for a particular 
array is experimentation with that array. 

We now consider a testing process operating under these assumptions. 
Consider the array of Figure 3.16, shown in successive stages of testing. The only 
extra-array connections are the Array Programmer’s processing and loader 
connections to cell (0 1), which aren’t shown in the figure. Here we assume the 
shift-register arm is to be 5 cells long; m = 5. When a new cell B is to be tested
for possible addition to a shift-register arm currently extending to its tip at cell A,
several things happen. Cell A is put in a state so that shift-register arm
information is routed to and from B. B is put into a loop state - (0 0 0 0), (0 1 0 1),
(1 0 1 0), or (1 1 1 1) - with the loop starting and ending at A’s side-set
shared with B. Assume N is the number of bits shifted completely through the
processing shift-registers of cell B and passed down the arm for monitoring. Also
assume the Array Programmer knows the contents of all the A shift-registers in
the arm up through cell A. (The Array Programmer should know this; it’s loaded
these registers.) Testing cell B in an arm of length m then requires (N + alength +
m x blength) Klock inputs to the arm, where alength and blength are the lengths of
shift-register A and shift-register B. Passing the test means that the shift-register
arm works properly; a new tip has been properly added. If cell B is the last,
mth cell of the arm, the Array Programmer is then satisfied that a shift-register
arm has been properly formed in the array. (See stages 8 and 9 in the figure.)
The Array Programmer doesn’t care whether the cells of the arm could have been
loaded from other directions, or would have worked in other function states. It
doesn’t care if some cells of the array haven’t been tested at all. (See cell (0 2)
in the figure.) The Array Programmer simply cares that its objective has been
realized. This pragmatic approach allows substantial reduction of testing time.

Fig. 3.16 Growth Of Perfect Shift-register Into Flawed Array
The connections of the Array Programmer to cell (0 1)'s left loading and
processing lines are not shown. Unmarked cells are good cells in the
(0 0 0 0) state.
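
As a worked instance of the Klock count given above for testing a new tip cell, the following sketch evaluates (N + alength + m x blength) and compares it with the number of bits in the whole arm. The register lengths and the value of N are hypothetical, chosen only for illustration, and `klocks_to_test` is not a name from the thesis.

```python
def klocks_to_test(n_bits, alength, blength, m):
    """Klock inputs needed to test new tip cell B in an arm of length m."""
    return n_bits + alength + m * blength


# Hypothetical sizes: 64-bit A and B registers, a 5-cell arm, and N = 10.
needed = klocks_to_test(n_bits=10, alength=64, blength=64, m=5)
full_loop = 5 * (64 + 64)    # bits in the whole arm, for comparison
print(needed, full_loop)     # 394 640
```

The count is smaller than the full loop length because the Array Programmer already knows the contents of the A shift-registers leading up to cell A.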

If cell B is meant to be part of a longer arm, it must be connected to an
interconnection neighbor cell, other than A, just as A was connected to B. The 
testing of this new, longer arm then proceeds as above. Growth is a recursive 
procedure. 

Failure of the arm after its extension from cell A to cell B indicates that
growth from cell A to cell B is impossible. A may be loading B incorrectly, B may
be flawed, A or B may be outputting Klock information to a neighbor elsewhere in
the arm, etc. The Array Programmer doesn’t worry about the specific nature of
the problem. It simply uses one of two reasonable flaw-models. Cell B may be
judged as a flawed cell never to be tested again, as in the example. This
simplification might be appropriate if the area of shift-registers A and B dominated
the area of a cell; failure was probably due to a failure in this area. A second
alternative is to just consider the boundary between cells A and B impassable in
the attempted direction. Cell B might be approached later from one of its other
neighbors.

If cell B can’t be approached from cell A, some new arm-path is tried if
there is still one to be tried. Cell A considers touching its neighbor cells in some
established order. When a neighbor is considered for touching, the touch is
attempted if the cell exists (isn’t out-of-bounds), isn’t known to be unloadable from
A, and isn’t already part of an arm. Furthermore, extension of the arm through 
that cell must, at least potentially, eventually yield an arm of the desired length. 
This last provision explains why no attempt is made to include cell (2 0) in the arm 
in the example; at best a length-4 arm would result. 

If all A’s neighbors have been rejected, the arm is forced to try some 
new path that includes all arm cells up to A. In the example, (0 0) of stage 3 is 
cell A. Since there’s no cell compatible with the existing arm that can be loaded 
from (0 0), the arm is retracted to cell (0 1), where new paths are considered. 
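
The touching and retraction rules above amount to a depth-first search with backtracking. The following is a minimal sketch in Python, not the thesis's own program. The function name `grow_arm`, the touching order, and the flood-fill bound used for the "at least potentially" test are all assumptions made for illustration. A flawed cell is modeled simply as one whose touch-and-test fails, and a failed cell is never tried again (the first of the two flaw-models).

```python
from collections import deque

# Assumed touching order: up, right, down, left, as (row, col) offsets.
STEPS = ((-1, 0), (0, 1), (1, 0), (0, -1))

def grow_arm(rows, cols, flawed, base, target_len):
    """Grow an arm of target_len cells from a good base cell, or return None.

    `flawed` is only consulted when a cell is touched, which stands in for
    the load-and-test step whose failure reveals a bad cell.
    """
    arm, in_arm, known_bad = [base], {base}, set()

    def candidates(cell):
        # Neighbors that exist, are not known to be bad, and are not in the arm.
        for dr, dc in STEPS:
            nb = (cell[0] + dr, cell[1] + dc)
            if (0 <= nb[0] < rows and 0 <= nb[1] < cols
                    and nb not in known_bad and nb not in in_arm):
                yield nb

    def potential(start):
        # Optimistic bound (an assumption): every cell still reachable from
        # `start` might join the arm.
        seen, queue = {start}, deque([start])
        while queue:
            for nb in candidates(queue.popleft()):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        return len(seen)

    def extend():
        if len(arm) >= target_len:
            return True
        for nb in list(candidates(arm[-1])):
            if nb in known_bad:
                continue  # learned to be bad while exploring a sibling path
            if len(arm) + potential(nb) < target_len:
                continue  # cannot possibly reach the desired length
            if nb in flawed:
                known_bad.add(nb)  # test failed: never touch this cell again
                continue
            arm.append(nb)
            in_arm.add(nb)
            if extend():
                return True
            in_arm.discard(arm.pop())  # retract the arm by one cell
        return False

    return arm if extend() else None
```

For instance, `grow_arm(3, 3, {(1, 1)}, (0, 0), 8)` returns an arm that winds around the flawed center cell, while asking for 9 cells in the same array returns `None`.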


A program simulates the method described above for loading a shift-register
arm into a flawed, rectangular array. The simplest fault model is used; a
cell is either perfect or hopelessly flawed. A program forms a flaw pattern of
specified dimensions with randomly sprinkled flaws. Another program tries to
realize the longest arm possible in the flawed array, growing from a specified, good
base cell. A time limit is used because the program would otherwise eventually
consider all possible arm paths extending from the base cell.

The repair program is short and simple. When an arm has grown to a
certain tip, it tries to extend itself toward the nearest array edge. Thus an arm
spirals toward the center of an array in a perfect array. If no improvement in the
maximum discovered arm is made in one-fourth of the time limit, the program looks
at adjacent cells that are not included in this longest arm and are not known to be
flawed. The program tries simple jogging of the arm to include these cells. The
program returns with a picture of the resulting arm in the flawed array, and some
statistics concerning the arm growth.
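
As a rough sketch of two ingredients of the simulation just described, the Python below sprinkles flaws at random and orders a tip's four directions so that the one leading to the nearest edge is tried first. The helper names, the `seed` parameter, and this particular reading of "toward the nearest array edge" are assumptions made for illustration; the original compiled Lisp program is not reproduced here.

```python
import random

def make_flaw_pattern(rows, cols, n_flaws, base, seed=None):
    """Return a set of n_flaws distinct flawed cells; the base cell stays good."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols) if (r, c) != base]
    return set(rng.sample(cells, n_flaws))

def edge_seeking_order(tip, rows, cols):
    """Order the (row, col) steps by distance from the tip to the edge they face."""
    r, c = tip
    dist = {
        (-1, 0): r,             # steps to the top edge
        (0, 1): cols - 1 - c,   # steps to the right edge
        (1, 0): rows - 1 - r,   # steps to the bottom edge
        (0, -1): c,             # steps to the left edge
    }
    return sorted(dist, key=dist.get)
```

A growth routine could consult `edge_seeking_order` each time it chooses which neighbor of the tip to touch next.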

Figures 3.17 and 3.18 show arms snaking through two different 25 x 25 
arrays. Statistics for these and other similar experiments appear in table 3.1.
Figure 3.19 shows graphs derived from table 3.1. 

Fig. 3.17 Result Of An Arm-growth Experiment

The relcon network above shows the path of an arm after an arm-growth
experiment. One can follow the arm’s path from its base, at (2 1), to its tip, at
(15 18). There are 625 cells, 100 flawed cells, and 463 arm cells. We were able
to repair the array to embed an arm with 495 cells. This suggests that our
program’s repair efficiency can be improved.

Fig. 3.18

The base cell is (1 1). There are 625 cells, 225 flawed cells, and 216 arm cells.

Table 3.1 Results Of Arm-growth Experiments
(1st of 2 pages)

Key:

%flawed - flawed cells as percent of all cells
Cells - total cells in square array
#flaws - total flawed cells in array
max-arm - the longest arm our program grew
%oftotal - cells in longest arm as percent of all cells
timelim - the time limit, in seconds
Time - the time the program ran
%oftimelim - time as percent of timelim
* - For two starred (or unstarred) arrays of the same size,
one set of flaw coordinates is a subset of the other.

Table:

 %flawed Cells  #flaws  max-arm  %oftotal  timelim  Time  %oftimelim

 0         100       0      100       100      100     1           1
 0         225       0      225       100      225     3           1
 0         400       0      400       100      400     5           1
 0         625       0      625       100      625     8           1
 4         625      25      590        94      625   206          33
*4         625      25      586        94      625   186          30
 4.44      225      10      208        92      225    60          27
*4.44      225      10      211        94      225    74          33
 5         100       5       93        93      100    27          27
*5         100       5       92        92      100    27          27
 5         400      20      367        92      400   141          35
*5         400      20      370        93      400   106          27
 8         625      50      551        88      625   216          35
*8         625      50      548        88      625   175          28
 8.89      225      20      187        83      225    63          28
*8.89      225      20      199        83      225    65          29
 10        100      10       85        85      100    26          26
*10        100      10       81        81      100    27          27
 10        400      40      335        84      400   109          27
*10        400      40      344        86      400   136          34
 12        625      75      505        81      625   265          42
*12        625      75      506        81      625   189          30
 13.33     225      30      168        78      225    73          33
*13.33     225      30      182        81      225    72          32
 15        100      15       72        72      100    27          27
*15        100      15       74        74      100    29          29
 15        400      60      301        75      400   112          28

Table 3.1 Results Of Arm-growth Experiments
(2nd of 2 pages)

[Rows for %flawed of 16 and above are not legible in this copy.]

Fig. 3.19 Graphs For Experiments Embedding Balanced Arms

The experiments suggest several conclusions:

1) For %flawed under about 25, %oftotal drops about 2.2% when
%flawed increases 1%. This is nearly independent of the size of the
array, with larger arrays doing slightly better. Repair efficiency also
drops steadily. For instance, for the unstarred 400-cell array in table
3.1, the repair efficiency drops from .98 at %flawed = 5 to .415 at
%flawed = 35.

2) As %flawed increases, a cutoff point is reached where %oftotal
drops precipitously. In our experiments, this occurred for %flawed
between 25 and 45. This cutoff occurs when an array is so flawed that
the arm is trapped. Very small arrays, such as the 100-cell arrays in
our experiments, tend to have lower repair efficiencies and lower cutoff
points, because a higher percentage of cells are edge cells. An edge is
a barrier that restricts the growth of an arm.

3) The time taken to embed an arm varies widely for a fixed %flawed.
It is roughly proportional to the number of cells in an array, and tends to
increase as %flawed increases. When the cutoff point is reached, the
time to embed an arm plummets. This is an example where testing and
repair time is far from growing astronomically with an array’s size, even
though very few input leads connect the Array Programmer and the
array.

4) If the active area of a cell is fixed, statistical considerations state
that %flawed varies less as slice size (and number of cells) increases.
This fact, the near-independence of %oftotal from the number of cells in an
array, the proportionality of the time to test and repair an array to its
number of cells, and the desirability of large memories in one integrated
circuit package all argue for fabrication of the largest possible slices.

5) How large should cells on a large slice be? Assume that the
dominant consideration is the number of bits in the largest embedded
shift-register arm. The total number of bits in an arm embedded on a
slice is proportional to the product of two factors:

a) The fraction Ya of total cells embedded in the arm. Our
experiments show that for a given cell yield Yc > 3/4, Ya is
approximately 1 - 2.2(1 - Yc). Yc is a technology-dependent
function of defect density and cell area.

b) The fraction of a cell’s area containing processing shift-registers.
If a cell has P area devoted to processing shift-registers
and V area devoted to other circuitry, this fraction is

P/(P + V).

We can express this product as a function of P and technology-related
parameters. Finding the maximum of this function via differentiation
tells us the value of P that yields the highest expected number of bits
in a shift-register arm. At one extreme, a large slice has nothing but
overhead circuitry. At the other extreme, it has one large, flawed cell.
Note that a minimum condition is that a cell be small enough to make
%flawed below the cutoff point. This condition is now met in most
technologies.
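
The trade-off in conclusion 5 can be explored numerically. The sketch below is only illustrative: the text says Yc is a technology-dependent function of defect density and cell area but does not give it, so a Poisson yield model Yc = exp(-D(P + V)) is assumed here, and the function names and the grid search (used in place of differentiation) are likewise assumptions.

```python
import math

def arm_fraction(cell_yield):
    # Ya ~= 1 - 2.2(1 - Yc), the fit quoted in the text for Yc > 3/4.
    return 1.0 - 2.2 * (1.0 - cell_yield)

def cell_yield(p_area, v_area, defect_density):
    # ASSUMPTION: Poisson yield, Yc = exp(-D * (P + V)).
    return math.exp(-defect_density * (p_area + v_area))

def relative_bits(p_area, v_area, defect_density):
    # Proportional to the bits in the arm: Ya * P / (P + V).
    yc = cell_yield(p_area, v_area, defect_density)
    return arm_fraction(yc) * p_area / (p_area + v_area)

def best_p(v_area, defect_density, steps=10000):
    # Search only cell sizes that keep Yc above 3/4, where the fit applies.
    max_total = math.log(4.0 / 3.0) / defect_density
    if max_total <= v_area:
        raise ValueError("overhead alone pushes cell yield below 3/4")
    grid = [(max_total - v_area) * i / steps for i in range(1, steps)]
    return max(grid, key=lambda p: relative_bits(p, v_area, defect_density))
```

With overhead area V = 1 and defect density D = 0.01 in matching units, `best_p(1.0, 0.01)` lands at an interior value of P: smaller cells waste area on overhead, and larger cells lose too many cells to flaws.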


Though the repair simulation program is simple, its performance is
encouraging. There are several ways it can be improved. In a production line
using large slices, the program would know an expected, minimum size of an
embedded arm for a slice of a given size. To save computer time, it could be
satisfied when it attained that minimum-sized arm, or one slightly larger. At this
point, use of much more computer time to maximize the arm would probably not be
worth the cost. Our simulation ran in compiled Lisp, and no effort was made to
improve speed. A production-oriented repair program would be carefully written
in assembly language. More computer time could be used to improve repair
efficiency.

We’ve repaired figure 3.17’s array to realize an arm 495 cells long.
Our ability to improve the repair efficiency from 88% to 94% suggests our repair
program’s performance can also improve. A more complicated and/or heuristic
program could improve repair efficiency. A simple extension of our program would
be more sophisticated about jogging an arm to include unused, good cells. Even
the current jogging procedure could be called several times, instead of only once
at the end of the main arm-growth procedure.
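
The 88% and 94% figures can be checked with a few lines of Python. This assumes repair efficiency means arm cells divided by good (unflawed) cells; that reading is not restated in this section, but it reproduces both numbers for the array of figure 3.17. The function name is hypothetical.

```python
def repair_efficiency(arm_cells, total_cells, flawed_cells):
    # ASSUMED definition: fraction of the good cells that ended up in the arm.
    return arm_cells / (total_cells - flawed_cells)

program_result = repair_efficiency(463, 625, 100)   # arm grown by the program
improved_result = repair_efficiency(495, 625, 100)  # arm after further repair
print(round(100 * program_result), round(100 * improved_result))
```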

Now consider our assumption that no faulty cell outputs a high S at the
same side-set where it outputs an alternating C. If a faulty cell outputs only
constant signals, this assumption is obviously valid. However, this assumption is
not valid if our assumptions are relaxed to say that all FAULTY outputs of a cell
are stuck outputs. In particular, it is not valid if a cell A’s only fault is a high
S.OUT - say S.L.OUT. In this case incoming loading signals may be routed to the
falsely-touched neighbor B at the same time they are routed to the appropriate
Array Programmer-intended cell. In this case we call cell A a branch cell, and cell B
the branch arm’s base. This branching is particularly vexing because its effects might
not be felt until much later in the testing. Consider the array of figure 3.20,
whose only fault is a high S.R.IN to cell (2 2). That is, (1 2) is a branch cell. In
such an array the indicated state could occur. The Array Programmer would only
know of the existence of the intended arm. When the intended arm tried to touch
(0 0), the branch arm would touch (1 0), causing the subsequent test failure of the
extended version of the intended arm that included (0 0). This failure could be
caused by a faulty (0 0) cell, but in this case it wouldn’t have been. Even if the
Array Programmer knew the failure was due to a loading branch, it wouldn’t know
where the branch occurred; here cells (1 0), (1 1), (1 2), or (1 3) could have
been branch cells. The problem is heightened by the fact that total retraction of
the intended arm via lowering (1 3)’s S.U.IN does not affect the branch arm. Indeed
it may grow further if more loading information is clocked into (1 2). Happily, a
working cell’s Incremental and Total Retractor circuitry implies that attempting to
load a good cell in a branch arm results in the freeing of all the cells from the
loaded cell to the tip cell in the branch arm.

There is a wide range of possible approaches to the loading branch. On
one extreme, the Array Programmer could assume that this branching problem does
not exist. If this assumption is invalid for a particular array, the Array Programmer
may find itself hopelessly confused. Then it quits its testing attempts, and signals
that the total array should be discarded. This is a fast approach that might be
reasonable if the probability of a branch cell was low; for instance, if cells had
many elements or arrays had few cells.

Fig. 3.20 Growth Of A Branch Arm

intended arm's base
branch arm's base

Fig. 3.21 Branch Arm Touching Intended Arm

Intend this          Get this
B  A  C              B  A  C

An array can be successfully loaded even if it has a branch cell, if one is 
willing to accept the extra testing involved. By our assumption that all faulty 
outputs are stuck at some value, a branch cell can only transmit loading information 
to a branch base if the branch cell’s C output to the branch base works. Then D 
of that side-set is either: 

1) an alternating signal transferred by the branch cell, as in the 
example above; or 
2) a fixed D signal, which causes the branch base to be continually 
reset due to its being programmed into the STA=1 state. 
(2) is no problem; it’s (1) we’re considering. 

The Array Programmer can use several facts to generate a list of
possible branch cells. When (1) holds, some of the tip end of the branch arm is a
translated version of the intended arm. This is true because the branch base
receives the same C and D information that the branch cell receives. The Array
Programmer can use this information, and its knowledge of the position of the
intended arm, to generate a list of possible branch cells. Knowledge of which cell
of the intended arm failed helps reduce the size of this list. This knowledge may
come from noting that all cells of an intended arm from its base through some cell
C properly transmitted their shift-register B; the cell touched by the branch arm
is then cell C or the cell to the tip side of cell C.

Assume cell A tried to touch cell B, and the subsequent test failed. The 
Array Programmer might suspect a branch cell if not even shift-register B of cell A 
((1 3) above) outputs properly during the test. A neighboring cell C, part of a 
branch arm, may have touched cell A immediately after the simultaneous loading of 
cells A and C, thereby displacing arm A’s tip to cell A. Cell A would then be loaded 
with information intended for cell B (see figure 3.21).

When the Array Programmer suspects a test failure occurred because
of a branch cell, it retracts the intended arm. The Array Programmer then regrows
the arm through cell A, and tries to terminate the arm with a loop at cell C, the
possible branch cell closest to the intended arm’s tip. This new arm is
tested. An unsuccessful test suggests that the potential branch arm’s base, cell C,
was the branch base; cell A, the branch cell, is marked as totally flawed. If
the test is successful, and there are other potential branch cells closer to the base
of the intended arm, these cells are tested in the same way cell A was. That is,
arm A is retracted and then hooked into a potential branch base cell. This process
repeats until all potential branch cells are tested, or a branch cell is found. If all
tests are successful, there was no branch cell. Testing and repair continue as if
cell A was merely unable to include cell B in the shift-register arm. In any event,
this process assures that no branch arm remains to clutter up the array. (If such
an arm never affects intended arm growth, we don’t care about it anyway.)


Note how the Incremental Retractor circuitry helps in the example
above. It allows the intended arm to touch and load a cell that’s been part of a
branch arm. (Of course, loading must be slow enough to make negligible the 
slightly different delays of C and D information traveling through arms A and B.) 
Furthermore, it allows quick incremental retraction of arm A when a potential 
branch cell is found to be good. 

These branch location steps are illustrated in figure 3.22 for the array 
of figure 3.20. Incremental retraction is used between all the stages shown. 

In another type of possible branching, a branch cell transmits high 
S.OUTs to more than one cell AFTER the branch cell has been loaded. This type of 
branching, which is much less likely than the other, can be handled in a very similar 
way. 

Of course, various steps can be taken to reduce the probability of a 
branch cell. Instead of one S line for selection, a cell could have a larger set of 
such lines. Only the proper combination of inputs to these lines would cause a cell 
to accept loading information. This approach could make chance selection, and 
consequent branching, arbitrarily unlikely by sufficiently increasing the number of 
selection lines.

The Array Programmer could send to a cell loading information stating
loading-input-direction, which the cell would compare to its Select inputs to decide
whether to accept a command. This technique would also help reduce the effects
of a branch cell by reducing the ratio of valid loading commands to total loading
commands. These techniques, and others like them, would only be employed after
a more thorough analysis of the probability of a branch cell for a specific cell
implemented in a specific technology.

Fig. 3.22 Location Of A Branch Cell

In loading more than one shift-register arm into an array, one must
worry that a branch arm will destroy a shift-register arm that has already been
formed and tested. If this possibility is sufficiently probable, it’s a good idea to
continue testing a completed shift-register arm while a new arm is being formed.
Effects of a branch arm can then be detected and countered before extensive
damage to the completed arm machine occurs. Besides maintaining the integrity of
the completed arm, this approach helps limit the confusion caused by a branch arm.

In limiting our consideration of possible failure modes to those above,
we are encouraged by a quote from <Von Neumann 66>:

"The axiomatization of automata for the completely defined situation is a
very nice exercise for one who faces the problem for the first time, but
everybody who has had experience with it knows that it’s only a very
preliminary stage of the problem." . . .

"There can be no question of eliminating failures or of completely
paralyzing the effects of failures. All we can do is to try to arrange an
automaton so that in the vast majority of failures it can continue to
operate."

Our discussion of testing and repair shows we can achieve Von
Neumann’s goal simply and efficiently by incorporating our loading, testing, and
repair mechanisms into a cellular array. The major limitation of our discussion -
the uncertainty of an appropriate flaw model - will be reduced when a particular
technology and cell layout are considered for the shift-register array.

Section 3.4: Production And Marketing Considerations

In previous sections we’ve considered the basic questions of array
architecture, testing, and arm growth. In this section we consider less fundamental,
but important, points relating to specifics of production and marketing.

Once an arm is extended slightly into an array, the arm has many
alternate paths; the curves of figure 3.19 then apply. However, it’s critical that
the Array Programmer be able to penetrate the array via an arm base cell. If the
Array Programmer can only access one such cell in a flawed array, there’s a
probability pflaw that that cell will be flawed, and the array will consequently be
unloadable. One way to ameliorate this situation is to fabricate an array with
Array Programmer-accessible bond pads to more than one cell - each a potential
arm base. If there are m such cells, the probability that no arm can be extended
into the array diminishes to about pflaw^m. Quick tests would establish which base
cells worked. The Array Programmer would then use one or all of these cells as
base cells for testing and arm growth. The base cells should probably be away
from the edge of the array. One reason is that the edge is more subject to flaws.
A second reason is that there are more directions for arm growth away from the
edge. Another ameliorating solution would put a circuit on a slice that accepted
extra-array inputs which told it which of several cell edges to logically connect to
the slice’s leads. For instance, one “cell” would replace shift-registers A and B of
a cell by wires. This non-cellular part of a slice would be less likely to be flawed
than a cell.
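
The effect of extra base cells is easy to tabulate. The sketch below assumes, as the pflaw^m estimate does, that the candidate base cells fail independently; the function names are hypothetical.

```python
def unloadable_probability(p_flaw, m):
    # Chance that all m candidate base cells are flawed (independence assumed).
    return p_flaw ** m

def base_cells_needed(p_flaw, target):
    # Smallest m for which the array is unloadable with probability <= target.
    if not (0.0 < p_flaw < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_flaw and target must lie strictly between 0 and 1")
    m = 1
    while unloadable_probability(p_flaw, m) > target:
        m += 1
    return m
```

For example, if one cell in five is flawed, `base_cells_needed(0.2, 0.01)` indicates that three accessible base cells bring the chance of an unloadable array under 1%.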
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Another important question relates to array size. How big should an 
array be? We know that all the procedures described so far work for arbitrarily
large arrays. We’ve also seen many arguments for large arrays. One constraint on
the size of arrays is manufacturing capability, which is geared toward dicing a
wafer of maximum 3" diameter into much smaller chips. The current 100% yield
approach has limited development of support machinery and techniques for the
realization of very large ICs. However, Texas Instruments did use a 3/2" diameter
slice for discretionary wiring (see <Spandorfer 68>). We’ve also heard that
Hughes developed a 50-watt package for a 3" slice as part of the Navy’s All
Applications Digital Computer program; unfortunately, we haven’t learned any
details about this yet. While many of TI’s and Hughes’ techniques for mounting,
packaging, cooling, etc. can probably be carried over to large cellular arrays, that
process may demand considerable investment. However, that process will
inevitably occur, spurred by improvements in IC yields. We are not even close to
a fundamental limit here.

For technologies that require power lines connecting many cells,
increases in array size increase the probability of array-destroying power
problems. The probability of a power bus being open-circuited can be made very
small by making the bus wide. Layout care can lower the chance of shorts
between a power bus and a signal line; most such shorts would probably not be
catastrophic anyway. Nevertheless very large arrays should perhaps include
protection devices in each cell or block of cells. This circuitry could cut a shorted,
or even overheated, cell off from its power source, before the malfunction blew
the power line’s fuse or sucked down the power line. The protection devices
could be a fuse, or could be semiconductor circuitry, such as common transistor-SCR
protection circuitry.

In any case, the well-defined nature of the protection circuitry’s 
expected load would enable it to be very simple. Figure 3.23 schematizes a 
possible layout for power lines and protection circuits. 

Another power-handling approach would make a cell’s supply of power 
controllable by the cell’s neighbors. For instance, any of a cell’s neighbors could 
command that the cell’s power supply be switched on or off. This could save 
power in an array, and reduce the danger of faulty cells, by channeling power only 
to the cells in an embedded machine. Indeed a “power arm” could be “grown” in 
parallel with a processing arm into an initially quiescent array of cells. 

Another question relates to the size of shift-registers A and B. Having
shift-register B longer than 1 bit helps in the monitoring of arm growth; if each
shift-register B in an arm contains a known pattern of 0s and 1s, the Array
Programmer can monitor the position of a faulty cell by noting the location of faulty
shift-register B output. On the other hand, a longer shift-register B demands a
correspondingly longer time to test an arm. Consequently a good length for
shift-register B is 2 bits. Shift-register A should probably be of a length consistent
with the maximum expected number of bits in a shift-register arm.
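
The idea of locating a faulty cell from the shift-register B output can be sketched as follows. This is an illustration under stated assumptions, not the thesis's test procedure: every cell's shift-register B is assumed to hold the same known 2-bit pattern, the observed bits are assumed to arrive grouped cell by cell, and the function name and default pattern are hypothetical.

```python
def first_faulty_cell(observed_bits, pattern=(0, 1)):
    """Return the index (in output order) of the first cell whose
    shift-register B bits differ from the known pattern, or None."""
    width = len(pattern)
    for start in range(0, len(observed_bits), width):
        if tuple(observed_bits[start:start + width]) != tuple(pattern):
            return start // width
    return None
```

With a 1-bit register a stuck output that happens to match the expected bit would go unnoticed, which is one way to read the preference for a pattern containing both a 0 and a 1.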


Fig. 3.23 Possible Layout Of Power Lines And Circuitry

An array yielding a maximum shift-register arm of a certain length can
be used to provide arms shorter than that length. This means an IC producer could
customize the same array to various customer needs. An unusually flawed array
could provide a small shift-register, and its package could be marked accordingly.
Customers could even be given an IC with a variable-length shift-register whose
length was controlled via a side-set’s loader inputs.

If function-specification state bits are nonvolatile, a shift-register arm 
can be loaded into an array before it’s shipped to a customer. The customer has 
the option of access to loading lines, which allow him to re-program or repair an 
array. 

If the function-specification state bits are volatile, there are several 
customer-manufacturer interface options: 

1) If a customer has a computer or other appropriate digital machine, 
he has the capability for testing and programming an array. He can use 
these capabilities, and a manufacturer-supplied program, on untested or 
slightly tested (e.g., for functioning arm base cells) arrays. 

2) The customer can receive a pre-tested array and a description of
the loading sequence required to form a specified arm in the array.
This description could be in some non-volatile form, such as read-only
memory, paper-tape, or paper. Loading an already-tested array is as
easy as loading a shift-register. Power is turned on, an S line is raised,
and [4 x (number of cells to be loaded)] bits are clocked via C and D
lines into the array.

3) A communication link, terminated by logic-interface machines on
each end, could connect the manufacturer and customer. (The link might
be a phone line or cable.) This link could be used for loading, and even
testing and repairing, of a customer’s machine by a manufacturer’s or
service house’s computer.

4) An array requiring very low power (such as a CMOS array) could
be shipped around with a battery-supply.

In any event, a volatile array must be backed up, either by a machine
capable of re-loading or by a power-supply insuring preservation of the function
state of the array.

It’s obvious that the techniques we’ve described for the shift-register
arm machine apply to any arm machine. Arm machine realizations are appropriate
to many machines which are realized as a chain of modules, with each module
communicating with at most two other modules, and only the modules at the end of
the chain directly connected to the machine’s inputs and outputs. Many
one-dimensional cellular arrays have this characteristic, so they could be appropriately
realized as arm machines in a flawed checkerboard array. The techniques for arm
machines easily generalize to the high-relcon and tree machines discussed in the
next two chapters.

CHAPTER 4: HIGH-RELCON MACHINES


Section 4.0: Introduction 

This chapter discusses arrays embedding high-relcon machines. High- 
relcon machines have fewer restrictions on communication between their essential 
cells than arm and tree machines. In an arm machine, no cell may have more than 
two essential neighbors. In a tree machine, only one cell may actively output
information at a given time. All the essential cells in a high-relcon machine may
have four neighbors, and all essential cells may be actively communicating different
information at the same time. High-relcon machines may therefore have speed and 
flexibility advantages. However, high-relcon machines are harder to test and 
repair because a cell may have up to four essential neighbors, and because 
essential neighbors in one high-relcon machine must be essential neighbors in all 
equivalent embedded machines. Powerful mechanisms - the loader, and balanced 
processing transmission states - allow test and repair of arrays embedding high- 
relcon machines. The description of a machine as an essential network facilitates 
repair by abstractly describing the machine in a repair-oriented way. 

High-relcon machines are conducive to a sequence in which the array is
tested, a plan for repairing the array is developed, and the array is repaired
through proper loading of good cells. This contrasts with the interwoven processes
of testing and repair appropriate to arm and tree machines. However, this
chapter’s methods still use a loading arm for loading cells during testing and
subsequent repair of an array. Transmission links form test links for testing an
array. These same transmission links may wire together essential neighbors in a
machine embedded in a flawed array. We detail the test and repair procedures
that use these function states. Experiments with repair procedures we’ve written
help us compare repair difficulties for arm and high-relcon machines, and suggest
ways to improve our repair procedures.

Application areas most appropriate to high-relcon machines are
considered. We present a simple cell, General, which enables realization of the
benefits of high-relcon machines. General may be used to realize highly parallel,
arbitrary sequential machines, within limits set only by the size of a General array,
its number of input-output leads, and the speed of its components. General
embodies the mechanisms we use to test, load, and repair high-relcon machines. A
General array may embed a universal computer-constructor-repairer that uses the
test and repair procedures we describe. General’s loading mechanism may be
controlled by an external Array Programmer. Moreover, a machine embedded
in a General array may be an Array Programmer; it can control the loading
mechanism of cells in its environment via a function state that transmits processing
inputs to one side’s loader outputs. This enables a machine embedded in a
General array to test, manipulate, and repair its cellular environment.

For specificity, we begin by detailing the General cell. Then we
consider a general testing and repair approach for embedding high-relcon machines,
and compare this approach to the one used for arm machines. We discuss
realization issues peculiar to high-relcon machines. A comparison of the properties
of high-relcon machines to the properties of arm, tree, and non-array machines
reveals applications most suited to high-relcon arrays.

Section 4.1: The General Cell

The General cell is amenable to realization of highly parallel sequential
machines. This cell incorporates the mechanisms essential to our testing and repair
approaches for high-relcon machines. Function states for processing, transmission,
and memorization of information allow realization of an arbitrary sequential machine
in the processing layer of an arbitrarily large checkerboard array. A Control
function state that transmits processing inputs as loading outputs enables an
embedded high-relcon machine to load cells in its environment. Such a machine
may control a loading arm and four test links to test, program, and repair its
cellular environment. Two or more such machines may monitor and repair each
other.

Fig. 4.1 General's Function States

Function states are shown for all values of (FM F0 F1 F2).

Figure 4.1 gives symbols for the General cell’s function states. Like the
cells of the last chapter, each General cell communicates directly only with its
neighbors or the extra-array world. There are no signal busses extending through
a General array. We've discussed the testing and repair advantages of this type
of cellular design. The loaders of the Shift-register and General cells are identical,
except that processing inputs can control loading outputs when a General cell is in
the Control function state. Each of a cell’s four sides has S, L, and D loader inputs
and outputs (as in figure 3.5), and a Processing input and output. Like the
Shift-register cell, the General cell incorporates all the loader’s options. The
shift-register loaded by a loading arm has four function-specification state bits -
FM, F0, F1, and F2 - and three loader state bits - LO0, LO1, and LSTA. This
shift-register is reset when power is turned on. In all but one function state, only
loader inputs affect loader outputs. However, in the Control state each Processing
input on one of three sides affects a different loader output at the right side:
P.U.IN = S.R.OUT, P.L.IN = L.R.OUT, and P.D.IN = D.R.OUT. This state allows a
machine, embedded in an array as a collection of function states, to re-program its
cellular environment by appropriate processing signals transferred to some cell’s
loader outputs.
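
The Control-state routing just described can be sketched in a few lines. This is only an illustration, not the cell's circuit: the function name is hypothetical, and it assumes the three right-side loader outputs are named after the S, L, and D loader lines listed above.

```python
def control_loader_outputs(p_u_in: int, p_l_in: int, p_d_in: int) -> dict:
    """Right-side loader outputs of a cell in the Control state.

    Assumption: each processing input (up, left, down) is passed
    unchanged to one loader output (S, L, D) on the right side.
    """
    return {
        "S.R.OUT": p_u_in,  # up processing input drives S
        "L.R.OUT": p_l_in,  # left processing input drives L
        "D.R.OUT": p_d_in,  # down processing input drives D
    }


# An embedded machine raising only its "up" signal asserts only S.R.OUT.
outputs = control_loader_outputs(1, 0, 0)
```

In any other function state this mapping would not apply, since only loader inputs affect loader outputs there.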

The Cross, L-turn, R-turn, and U-turn states are types of balanced, non- 
branching transmission states. Cross is a crossover; the others are bends. We'll 
see that Cross, L-turn, and R-turn are very useful for testing and fault-avoidance;
note their similarity to the shift-register cell’s non-tip states (see figure 3.12).
Cross, L-turn, and R-turn may combine to form a transmission link arm that snakes
through an array. Such a link may act as a two-way wire bus, or simply as a wire 
carrying information in one direction. U-turn is useful in testing; note its similarity 
to the shift-register cell’s tip states. 

State (- 1 0 0) is a memory state. In this state, FM is not used in its 
customary function-specification state bit role; instead it’s a processing layer 
P.R.IN-selectable D flip-flop. A Reset input for this flip-flop is not provided, but 
this function is easily simulated by proper manipulation of P.R.IN and P.D.IN. This 
memory state is very convenient for realization of registers, addressable read- 
write memories, and other common memory modules. 


Fig. 4.2 A Function Performed In Different Orientations

Function F: out = (a + c) (a + b) (b + c)
Some busses between opposite sides are not shown.

A) Array A has inputs and output at its left.
B) Array A has its inputs and output at its right.
C) A rotated version of Array A, aided by U-turns,
performs F with its inputs and outputs above.
D) A rotated version of Array A, aided by U-turns,
performs F with its inputs and outputs below.

This 4x3 array is the array of largest cells above. It is a rotated version
of Array A.

The states associated with F2 = 1 allow convenient realization of a
combinational logic function expressed, for instance, as a minimum product of sums
or sum of products. Figure 4.2 shows that these states function and combine very 
much like the states in programmable logic arrays. These General states, coupled 
with U-turn, were designed to eliminate the severe waste of cells that often 
results from cell designs that only operate on signals coming from a given,
preferred direction. Those designs demand the use of many cells to turn an input
signal into an appropriate orientation. Figure 4.2 presents sample realizations of a 
logic function, and indicates the ease with which General arrays operate on signals 
to or from various directions. This is particularly important for functions with many 
input-output lines. 
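
As a small check on the sample function, the sketch below evaluates Function F of figure 4.2 in its product-of-sums form and compares it with the equivalent sum-of-products form ab + ac + bc (a standard Boolean identity; F is the three-input majority function). The helper names are illustrative only.

```python
from itertools import product


def f_product_of_sums(a, b, c):
    # Function F as given in figure 4.2: (a + c)(a + b)(b + c)
    return (a or c) and (a or b) and (b or c)


def f_sum_of_products(a, b, c):
    # Equivalent sum-of-products form: ab + ac + bc
    return (a and b) or (a and c) or (b and c)


def truth_table(fn):
    # Outputs for inputs (a, b, c) = 000, 001, ..., 111
    return [int(bool(fn(a, b, c))) for a, b, c in product((0, 1), repeat=3)]


pos_table = truth_table(f_product_of_sums)
sop_table = truth_table(f_sum_of_products)
```

Both tables agree, so either form could be mapped onto the F2 = 1 states.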

The fact that digital machines usually require extensive signal-routing 
explains the cell’s emphasis on bussing signals from one side to an opposite side. 
This allows a cell to perform bussing operations at some output while 
simultaneously performing a branch, combinational logic, or memory function at 
another output.

It’s easy to see that arbitrarily large, properly programmed General
arrays can perform any time-independent, effectively computable computation. It’s
been demonstrated that today’s general-purpose computers can perform such a
computation if their memory capacity is unlimited (see <Minsky 67>). Like <Banks
71>, we therefore need only show the ability to realize an extensible
general-purpose computer in the General array. The ability to realize a
general-purpose computer comes from the availability of its basic components -
Nand gates, wires, and memory elements. Extensibility comes from the Control state
and loading mechanism. An array-embedded computer can be constructed to control the
processing inputs, and consequently the right side’s loader outputs, of a Control
cell on a right side of the computer's periphery. We've seen that appropriate
loader signals to an arbitrary cell allow the growth of a loading arm to an
arbitrary cell in a perfect array. The array-embedded computer can consequently
send signals to increase its memory as needed.

Since such a machine has a moveable construction arm, it can construct 
arbitrary digital machines in an arbitrarily large array. For instance, it can 
construct a copy of itself. It is therefore also a universal constructor.

We'll see that, for array faults of a certain assumed nature, an Array
Programmer can test an array and embed a perfect machine in a flawed array.
Since the Array Programmer can be realized in a flawed array, the General cell
allows universal repair for faults of an assumed nature.

Thus the General array can support a universal computer-constructor-repairer.

General is universal, but simple. A processing mechanism’s complexity 
results in advantages and disadvantages whose importance depends on the cell’s 
use. The need for a low proportion of flawed cells in an array embedding
high-relcon machines currently requires that only simple cells be fabricated on a
slice containing many cells. Basic, universal cells allow an embedded machine’s
designer to exploit the parallelism in a given algorithm. Testing, repair, and
signal-routing require cells to assume transmission states; using a very
complicated cell in such
a simple state wastes most of its complicated mechanism. On the other hand, a 
simpler cell has a smaller ratio of processing circuitry to loading circuitry; the 
simpler cell suffers from a higher associated overhead when the loading circuitry is 
quiescent. When a cell’s simplicity requires more cells for a given machine, the 
function-selection in each cell slows the machine. 

One component of a cell’s complexity is its number of processing lines. 
If a cell has many processing lines in a side-set, routing each of the lines to or 
from a different part of an array requires many cells to break the lines from the
side-set’s bundle of lines. Furthermore, unless independence of different parts of
a cell’s processing mechanism is assumed, test time per cell rises exponentially
with its number of processing inputs. 

An array designer weighs these general considerations along with specific
design goals when designing a high-relcon array.

General’s processing mechanism is one consistent with efficient 
implementation of our testing, repair, and computation goals. The Cross, L-turn, R- 
turn, and U-turn states are important components of test arms and transmission 
links in testing and repair. Although General cells perform wiring operations in 
many states, signal-routing is so important that expanding General’s signal-routing 
capabilities might be worthwhile. Some variation of the Control state is necessary 
for realization of our goal of array-embedded array manipulators. The sequential
machines we envision for General would use enough memory to support a memory
state; construction of memory elements from gates would require a greater
proportion of cells in an array than is justified by the resultant simplification of a
cell. Indeed, actual applications might argue for more memory elements in a cell
and/or more memory-oriented function states. It’s true that cells with (F1 F2) =
(0 1) are rotated versions of cells with (F1 F2) = (1 1), and that cell states can be
eliminated by clever use of the (0 1 0 1) cell. Again, these cell simplifications
would probably result in disproportionate numbers of cells for most applications.

We briefly digress to give a little information about a familiar machine, a 
miniprocessor, unique only because we designed it as a machine embedded in a 
General array, and because a special feature allows it to test and repair its
cellular environment. This miniprocessor could be the processor of a universal 
computer-constructor-repairer. This digression is intended to give some specific 
information about our cellular realization of a machine like one many readers are 
familiar with; those who aren’t will not lose continuity by jumping to the next 
section. We don’t think the General cell is particularly suited to realization of 
conventional processors, because processors are already mass-produced ICs. 
However, we do want to demonstrate the General cell’s power. Furthermore, this 
design gives some insight into the number of cells of various types needed to
implement a somewhat familiar machine.

The miniprocessor we designed is a 16-bit parallel, synchronous,
single-sequence machine with conventional A-B-C bus structure. Figure 4.3 gives a
map of the miniprocessor. The machine has 66 extra-array lines: 1 clock, 1 interrupt,
8 data inputs, 8 data outputs, 15 memory address, 1 “write memory”, 16 memory
data inputs, and 16 memory data output lines. The machine also has four test
links, and one loader arm for testing and repair; we discuss use of these easily
implemented features in later sections. The machine's main sections are a Timing
and Control section, a Memory Interface section, and an Arithmetic-Logic
Unit/Registers section. Both the Memory Interface and ALU/Registers sections
have 16 similar modules, one for each bit-slice. The Memory Interface Section
contains the 14-bit instruction register, and many transmission links. The
ALU/Registers section contains six large (15 or 16-bit) registers; these are the
Accumulator, Program Counter, Instruction, Subroutine Return, Interrupt Return, and
Input-Output/Test & Load register. This section’s 16 blocks are identical, except
that the block interfacing with the Timing and Control Section is slightly different.
The miniprocessor has fairly conventional arithmetic, logical, subroutine, interrupt,
and input-output capabilities. Instructions are processed in a conventional,
single-sequence way.

Fig. 4.3 Map Of Miniprocessor-Tester-Repairer

We specified this machine as one embedded in a perfect, rectangular
General array with about 9,000 cells. Its non-writing indirect memory reference
instruction takes three cycles, with about 700 cell-delays for each cycle. Since
most cells introduce about one gate-delay, a cycle takes about seven microseconds
for a technology with a gate-delay of 10 nanoseconds. Each rectangular
ALU/Register slice gives an example of a mix of cell types; each has 18 unused
cells, 147 transmission cells, 53 combinational logic cells, and 6 memory cells. Each
bit-slice has 88 essential cells: 53 combinational logic cells, 6 memory cells, 16
U-turns, and 13 branches. There are 118 non-branching transmission cells used as
wires. Other parts of our processor-tester-repairer had an even higher ratio of
wire cells to essential cells. This emphasizes the importance of good signal-routing
capabilities in high-relcon arrays.
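
The figures quoted for the miniprocessor can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in the text; the one simplifying assumption is that every cell contributes exactly one gate-delay (the text says "about one"). Variable names are illustrative.

```python
# Timing: about 700 cell-delays per cycle, 10 ns per gate-delay, 3 cycles.
GATE_DELAY_NS = 10
CELL_DELAYS_PER_CYCLE = 700
CYCLES_PER_INSTRUCTION = 3

cycle_us = CELL_DELAYS_PER_CYCLE * GATE_DELAY_NS / 1000   # microseconds per cycle
instruction_us = CYCLES_PER_INSTRUCTION * cycle_us        # indirect memory reference

# Extra-array lines: clock, interrupt, data in/out, address, write, memory data in/out.
extra_array_lines = 1 + 1 + 8 + 8 + 15 + 1 + 16 + 16

# Cell mix of one ALU/Register bit-slice.
slice_cells = {"unused": 18, "transmission": 147, "combinational": 53, "memory": 6}
essential = {"combinational": 53, "memory": 6, "u_turn": 16, "branch": 13}
wire_cells = 118

total_cells = sum(slice_cells.values())
essential_cells = sum(essential.values())
# Transmission cells are the U-turns, the branches, and the plain wires.
transmission_cells = essential["u_turn"] + essential["branch"] + wire_cells
wires_per_essential = wire_cells / essential_cells
```

Under these assumptions a cycle takes 7 microseconds, the three-cycle instruction about 21 microseconds, and each slice spends more cells on wiring (118) than on essential functions (88).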

Testing and repair techniques using the General cell depend only on the
loader and processing transmission states, so the testing and repair approach for
General can be applied to other high-relcon arrays with loader and processing
transmission capabilities analogous to General’s.

Section 4.2: Introduction To Testing, Construction, And Repair

Testing, configuration, and repair for high-relcon machines are similar to
those processes for balanced arm machines, although there are important
differences. The chief differences are that high-relcon machines are not conducive
to the interwoven processes of test and repair; and test and repair are more
difficult and less efficient for high-relcon machines. We consider an approach
applicable to any high-relcon checkerboard array with our loading arm and
transmission link facilities. We mention how a Control state like General’s may be
tested, but this state is not essential to our testing and repair approach.

In considering embedding an arm in an array, we made certain 
reasonable assumptions concerning failure modes of the array. Then the 
interwoven processes of testing and repair were considered. These processes 
occurred by the gradual snaking of an arm into an array. A cell was tested only 
insofar as necessary to establish its successful incorporation into a desired arm; 
this usually meant a cell wasn’t tested in all of its states. Testing of a new arm- 
tip cell required using a partially tested cell, but this presented no difficulty. 

In considering embedding high-relcon machines, we make assumptions 
very close to those made in the last chapter. However, most high-relcon machines
are poorly suited to gradual growth and testing for two main reasons:

1) In growing an arm, the number of relevant extra-array processing
inputs and outputs remains fixed. However, high-relcon machines
usually have a variable, sometimes large number of relevant side-sets
as they're grown. Most generally, this requires test arms linking a test
machine to the relevant side-sets at a partially grown machine’s
periphery. This requires an Array Programmer to have a variable and
often large number of test arms and associated links. We'd much prefer
to have a low, fixed number of such links. Consequently, we test cells
individually, relying on independence assumptions about cells’ behavior.

2) Embedding an arm in a flawed array can be done efficiently by
gradual growth of the arm, followed by local jogging of the arm to
include clumps of good cells. High-relcon machines benefit greatly from
a global repair approach that begins with a description of all the flaws
in an array. This means that repair efficiency is improved by separation
of the test and repair procedures.


These considerations explain why the test and repair processes for
high-relcon machines are segmented into a series of several distinct procedures.

First the Array Programmer’s Test procedure tests an array, noting the
location of faulty cells. This testing is independent of the essential machine that is
eventually embedded in the array, so Test’s results are valid until an array
develops a new flaw.

A Repair procedure determines how to embed a perfect machine in the
faulty array. Repair accepts a flaw pattern description of a flawed array from
Test. Repair also accepts an essential network model of the desired essential
machine. Repair’s output is a description of the repaired array that places each of
an array’s cells into one of the following four categories:

1) The cell is flawed.

2) The cell is an essential cell.

3) The cell may assume an arbitrary non-Control function state. None of
its outputs is relevant to the embedded machine's output.

4) The cell is in a Cross, L-turn, or R-turn transmission state. The cell
is part of one or more wires associated with relevant inputs and
outputs of essential cells.

The Construct procedure constructs a perfect machine in a flawed array.
Construct modifies Repair’s output by mapping each of an essential machine’s
essential cell states into a properly located essential cell. Repair has arranged
that essential cells be wired together in the proper way. Construct accepts from
Test a model of the flawed array stating which side-sets may definitely be used
for loading; Test develops this model as it tests an array. Every cell that Test
finds to be good has some side-set that can be used for loading the cell.
Construct only activates the side-sets specified by Test as it extends a loading
arm into an array. Construct’s loading arm may touch any good cell, but it always
touches and loads essential cells (category 2) and wire cells (category 4). When
Construct completes its loading task, a perfect machine is embedded in the array.
The embedded machine is ready for further test or use.
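
The hand-off from Repair to Construct can be pictured as a map from cell positions to the four categories above. The sketch below is a hypothetical data model, not the thesis's program: the `Category` names, the coordinate tuples, and `cells_to_load` are all illustrative assumptions. It shows only the rule that Construct must load category 2 and category 4 cells.

```python
from enum import Enum


class Category(Enum):
    FLAWED = 1      # category 1: the cell is flawed
    ESSENTIAL = 2   # category 2: an essential cell
    ARBITRARY = 3   # category 3: any non-Control state, outputs irrelevant
    WIRE = 4        # category 4: transmission state, part of a wire


def cells_to_load(repair_map):
    """Positions Construct must load: essential cells and wire cells."""
    return sorted(
        pos for pos, cat in repair_map.items()
        if cat in (Category.ESSENTIAL, Category.WIRE)
    )


# A made-up 2 x 3 repaired array, indexed by (row, column).
example_map = {
    (0, 0): Category.ESSENTIAL, (0, 1): Category.WIRE, (0, 2): Category.FLAWED,
    (1, 0): Category.ARBITRARY, (1, 1): Category.WIRE, (1, 2): Category.ESSENTIAL,
}
load_list = cells_to_load(example_map)
```

Category 3 cells may still be touched by the loading arm on its way to other cells, but nothing in the embedded machine depends on their state.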


Our high-relcon repair procedure assumes that the length of wires
between essential cells is irrelevant to the proper functioning of an embedded
machine. Possible techniques for assuring the validity of this assumption are
suggested at the end of this chapter.

Section 4.3: Testing

Testing an array embedding a high-relcon machine involves the one-by-one
testing of the cells in that array via test links between the tested cell and an
Array Programmer. This procedure is relatively difficult, compared to testing of an
array embedding an arm, because Test doesn’t know how Repair will map a perfect
machine into the faulty array. This implies that most cells must be tested in all
their function states. Because all of a cell’s accessible processing inputs and
outputs may affect an embedded machine’s output, Test must vary the accessible
processing inputs to the cell, and monitor the accessible processing outputs.
Consequently testing a cell usually involves linking each accessible side-set with
the Array Programmer via a test link. Figure 4.4 shows that the processing
transmission states are ideally suited for this task.

Test makes the assumptions listed below. Each assumption is analogous
to the corresponding assumption made for shift-register cells.

1) Good cells are only loaded under Test’s control, or because of a
branch cell, and not by signals caused by faulty cells.

2) A cell’s performance depends only on that cell’s mechanism, state,
and input signals.

3) A successfully tested cell does not develop a fault before the
Construct process is complete.

4) A cell’s processing outputs don’t depend on its loader state; and,
unless the function state is the Control state, loader performance
doesn’t depend on the function state. This non-essential, reasonable
independence assumption allows a reduction in testing time. Test need
not, for instance, test a function state for all loader states.

Fig. 4.4 Test Links To Processing Lines Of Tested Cell


In considering testing, we first focus on the test stages that occur when 
all tests are passed. We then address implications of test failures, and possible 
flaw models. The modelling question is pursued in the subsequent description of 
Repair. 

Testing a cell requires explicit tests of its permissible function states,
and concurrent implicit tests of its loader. Tests of a typical cell involve two
types of communication between the Array Programmer and the cells at the test
site. Test links connect the Array Programmer to the processing inputs and
outputs at the test site, as in figure 4.4. The test links are composed only of cells
in the Cross, L-turn, or R-turn transmission states. The Array Programmer
requires one test link to each accessible side-set. The Array Programmer
communicates to a tested cell through signals to and from the base of each test
link. Besides the test links, a loading arm extending to the tested region links the
Array Programmer with loader inputs. This arm may pass through cells that are
also in a test link, or even through the tested cell. (However, the Array
Programmer should not relay high processing signals down a test link connected to
the up, left, or down side-set of a cell being loaded, and temporarily in the Control
function state.) The Array Programmer may change the state of cells, such as the
tested cell, either by sending signals into the base of the loading arm, or by
sending processing signals down three test links that converge on a Control cell.

Testing a cell’s non-Control function states involves cycling it through
those function states the cell may assume in an embedded machine. For each such
state, appropriate stimulus signals, and responses to these signals, flow through
the test links. We'll see that Repair always specifies that a good cell adjacent to
a hopelessly flawed cell assume a Cross, L-turn, or R-turn state; this is an
example of the tested function states being a subset of the set of all non-Control
function states. In this case, only some of the tested cell’s side-sets are
accessible. A functional test of a non-Control, non-Memory function state involves
at most 2^4 = 16 input combinations. Fewer input combinations may be appropriate
if some side-sets are inaccessible, or if independence of certain outputs and
certain inputs is validly assumed. For instance, the left Processing input might be
experimentally found to never affect the right Processing output in the U-turn
state, even for a faulty cell; this would allow simplified testing of the U-turn
state.
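
The count of input combinations follows from one Processing input per side. A minimal sketch, assuming the four sides are labelled up, left, down, and right and that only accessible side-sets can be driven (the function name and labels are illustrative):

```python
from itertools import product

SIDES = ("up", "left", "down", "right")


def stimulus_patterns(accessible=SIDES):
    """Every combination of processing-input values on the accessible sides."""
    return [
        dict(zip(accessible, values))
        for values in product((0, 1), repeat=len(accessible))
    ]


all_patterns = stimulus_patterns()                          # four accessible sides
fewer_patterns = stimulus_patterns(("up", "left", "down"))  # right side walled off
```

With all four side-sets accessible there are 16 patterns; each inaccessible side-set halves the count.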

Testing a cell’s response time in a given state is possible, if the Array 
Programmer can accurately time a test link’s output response to an input. 
Differential techniques then allow the calculation of the delay associated with each 
test link. Additional delay comes from delay through the tested cell. 
Unfortunately, accurate timing requires time resolution of less than one gate-delay,
which is difficult to achieve.

If the Array Programmer knows the delay through each test link, test
time on a given function state depends on how quickly the Array Programmer can
change the input to a test link. This is limited by the Array Programmer’s speed or
the bandwidth of a cell. Any inaccuracy in the estimate of the delay through a link
may also limit test speed by effectively reducing the bandwidth of the link.

In testing a Control function state, test links connect the Array
Programmer to all four of the side-sets of the cell in the Control state. First the
Array Programmer verifies that the right side-set’s test link is not a test arm, by
ascertaining that a signal into the base of the test link doesn’t return to the base
after an appropriate delay. Then signals into the up, left, and down processing
inputs command the tested cell to load the cell to its right into a U-turn state.
The Array Programmer again tests the right test link. If it’s now a test arm, the
Control state is good; otherwise the Control state is bad.
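
The three steps of the Control-state test can be sketched against a toy model. Everything here is a hypothetical stand-in: `ToyTestSite` replaces the real cells and test links, and raising an error when the right link already echoes is our own choice, since the text doesn't say what happens in that case.

```python
class ToyTestSite:
    """Toy model of a cell in the Control state and its right neighbor."""

    def __init__(self, control_works: bool):
        self.control_works = control_works
        self.right_neighbor_is_u_turn = False

    def right_link_echoes(self) -> bool:
        # A link whose tip is a U-turn returns signals to its base (a test arm).
        return self.right_neighbor_is_u_turn

    def drive_load_u_turn(self) -> None:
        # Up, left, and down processing inputs carry a loading sequence; only a
        # working Control state relays it to the right-side loader outputs.
        if self.control_works:
            self.right_neighbor_is_u_turn = True


def control_state_is_good(site: ToyTestSite) -> bool:
    if site.right_link_echoes():                 # step 1: must not be a test arm yet
        raise RuntimeError("right test link is already a test arm")
    site.drive_load_u_turn()                     # step 2: command the load
    return site.right_link_echoes()              # step 3: re-test the right link


good = control_state_is_good(ToyTestSite(control_works=True))
bad = control_state_is_good(ToyTestSite(control_works=False))
```

The test observes the Control state only through its effect on a neighbor, which is why the right-hand link must first be known not to echo.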

Testing a cell’s loading mechanism is implicit in the tests of the cell’s
permissible function states. If a cell fails its function tests, Construct doesn’t try
to load it. If a cell passes its function tests, a loading arm has successfully loaded
the cell and retracted from the cell. Therefore Construct’s loading arm can also
load the cell from some side-set. Test keeps a map of which side-sets the loader
uses to successfully activate and de-activate working cells. Construct uses this
map to determine the path of its loading arm.

After a cell has been tested, test links must be moved to a new test
site, if there is any remaining. The new test site is usually a cell adjacent to the
last tested cell. Thus only the tip ends of the loading arm and test links need be
moved. This is fairly simple, since a loading tip is at the test site. Each test link
is gradually extended as part of a test arm, just as arms were extended in chapter
3. After each incremental extension, all links are tested to assure growth is
proceeding satisfactorily. Since a test arm only incorporates cells in transmission
states, faulty cells are discovered and avoided as in chapter 3. This gradual
extension is particularly appropriate in an array with a high fault density. In an
array with a very low fault density, the speedup from non-gradual growth could
offset the slowdown from a faulty cell’s confusion factor.

The process of moving the test site terminates with each of the new 
test cell’s accessible side-sets connected to a test link. The new test cell, in the 
U-turn state, is the tip of one or more test arms. The test process is repeated 
for this cell. 

In the last chapter we noted that failure after an incremental arm 
extension could mean several things. For instance, the new tip cell might be 
hopelessly flawed, or it might just be incapable of receiving information from the 
indicated direction. We noted that various flaw models might be appropriate, 
depending on the cell layout and the sophistication of the Array Programmer. 

This modelling difficulty again arises with the high-relcon array. Growth
of test links is analogous to growth of shift-register arms, so the same comments
apply. A similar difficulty arises when a cell is in the process of being fully tested.
The cell may produce nonsense in all states; modelling that cell as hopelessly
flawed is then definitely appropriate. However, it may also happen that an output
value is only wrong when it’s a function of a particular input coming from a cell
[...] output, as unusable might be valid. Choice of sophistication level in the Repair
procedure’s treatment of slightly flawed cells depends on whether the
sophistication is worth the computational cost. In our discussion of Repair, we
assume an array may be modelled by a flaw pattern in which every flawed cell is
represented by an X.

Section 4.4: Repair
The Repair procedure determines how to embed a perfect, high-relcon
machine in a flawed array. Test passes Repair a flaw pattern description of the
flawed array. Most generally, Repair may embed the largest grid machine it can.
Construct may then construct in the flawed array any machine with an essential
network that fits into this largest grid. We consider this first. Most
actual embedded machines have essential cells with irrelevant side-sets; their
essential networks are grids with squares and links missing. The most general
Repair method is then less efficient than a method which notices an incomplete
grid. We eventually consider such a less general, more efficient Repair procedure.
Its main drawback comes when a new machine must be embedded in the flawed
array. If the new machine's essential network is not a subnetwork of the original
essential network, new repair of the array is necessary. Repair decides how to
locate and wire together good essential cells, using only good cells in transmission
states as wires, to embed a perfect machine in a flawed array. This allows
Construct to associate the proper function state with each essential cell, and to
wire together essential cells with transmission states dictated by Repair.

The Repair procedures that we have written assume the simplest fault
model: a cell is either good or hopelessly flawed. This conservative assumption is
most questionable, because of its harshness, when Test finds a side between cells
A and B is impassable. This condition can be safely modelled [...]
cell A or cell B is hopelessly flawed. Repair knows that an unflawed cell should
not allow a faulty cell’s output to affect an embedded machine’s output. Thus no
output-affecting signal will be transmitted across the faulty side. In all other cases
where a cell displays some faulty behavior, it is modelled as a hopelessly flawed
cell.


Consider two checkerboard arrays with the same distribution of good
and bad cells, and consequently the same flaw pattern. The first array is for
embedding a grid machine, and the second is for embedding an arm machine. Since
[...] the grid’s relcon networks, the longest arm embedded in the second array
contains at least as many essential cells as the largest grid embedded in the first
array. Embedding a grid in a flawed array involves using some good cells purely as
links between essential neighbors. Cells in the corresponding position in the second
array can be used as arm cells, because cells in links have relcon as high as cells
in arms. Thus optimum repair efficiency for the arm machine is at least as high as
optimum repair efficiency for the grid machine, given the same flaw pattern.

How do the optimum efficiencies compare? Answering this question
from a non-experimental, purely mathematical perspective appears very difficult.
An analytic, tractable expression for optimum repair efficiency, given a particular
flaw pattern, appears impossible for most cases. An expression for average
efficiency, averaged over all flaw distributions for a given number of flawed cells
in an array of a certain size, also appears impossible for both arms and grids.
Although one might find some lower bounds for repair efficiency, it’s likely that the
bounds would not be close enough to the optimum to be practically interesting.
Furthermore, one would still have little knowledge of the difficulty of attaining or 
surpassing a lower bound in an actual Repair procedure. 

Consequently, our approach has been to write promising repair 
procedures, observe their behavior, and use our observations to suggest 
improvements in the procedures. Some of these suggestions are implemented, and
the process repeats. 

Many actual essential machines contain a mixture of low-relcon and 
high-relcon essential cells. Figure 4.5 gives the relcon network for our embedding 
of one bit-slice of the ALU/register section of our processor-tester-repairer in a 
perfect array. The upper-right region of the bit-slice has many high-relcon 
essential cells, and has few links to nodes outside the region. On the other hand,
the bit-slice has many relcon-2 chains, balanced arms, and even relcon-0 cells.
Many relcon-2 and relcon-4 cells are used as a wire or crossover. 

In embedding the bit-slice in a flawed array, we could approximate its 
essential network by a grid. Adding constraints to Repair in this way would have 
three major effects: 

1) It would simplify the description of the slice’s essential network. 

2) It would make Repair’s results valid for any machine that fits into a
perfect 7 x 32 array.

3) It would diminish Repair’s efficiency.


In this section, we first consider grid-embedding - the most difficult, general repair
in a checkerboard array. We compare it to arm-embedding. We
then suggest an approach which improves embedding efficiency by noticing missing
links between nodes in a high-relcon machine's essential network. This type of
approach is the most feasible for most [...]

Fig. 4.5 Relcon Network For One ALU/Register Bit-slice

In the rectangular bit-slice there are 224 total cells: 18 relcon-0 cells, 16
relcon-1 cells, 78 relcon-2 cells, 28 relcon-3 cells, and 84 relcon-4 cells. The
bit-slice’s relcon network represents a compact embedding of a machine with 88
essential cells: 53 combinational logic cells, 6 memory cells, 16 U-turn cells, and
13 branch cells. 118 of the relcon-2 and relcon-4 cells are non-branching
transmission cells, which are used as wires. Other parts of our
processor-tester-repairer had an even higher ratio of wire cells to essential cells.

It’s usually difficult to optimally embed a grid in a flawed array. If the
array has very few flaws, Grid Repair is easy; there are few reasonable
ways to interconnect good cells to form a large grid. As the number of flaws in
the array increases, the number of reasonable ways to form a large grid explodes.
Repair cannot consider all possible embeddings; this would take too much
computation. The obvious, simple repair methods we've applied to these [...]
don’t work well. Eventually there are so many flaws in an array that the
embedding problem is easy, because it’s obvious that no grid can be embedded in
the array.

We focus on the most. difficult grideembedding flaw.region. We present 
@ reasonable approach which is: considerably. mare.sophisticated. than. the only 


similar approach we’ve..seen, which is Kukreja’s repair.of: cutpoint-connected. 
arrays. es ee : | 
The nucleus of Grid Repair is a Twist Repair procedure. This procedure,
which we'll detail, is very efficient at embedding grids in moderately large
rectangular arrays of flawed cells. Another procedure, Blockoff, accepts as inputs:

1) an essential network for a machine embedded in a perfect array.
This network is described as interconnected rectangular grids.

2) a flaw pattern for a flawed array, where each cell is either perfect
or flawed.

We temporarily assume that Repair need not consider the location of a
flawed array's input-output lines; we assume these are attached after an array is
repaired. We'll see that Blockoff is easily modified to consider the location of
input-output lines. Blockoff partitions the flawed array into rectangular blocks
separated by interconnection strips; each block is intended to hold a grid.
Blockoff then asks Twist Repair to determine how to put a proper-sized grid into
each of the blocks. If Twist Repair cannot perform its task for one of the blocks,
Blockoff fails. Otherwise Blockoff decides whether it can interconnect the proper
grid links extending from each block. If it succeeds, Blockoff passes the resulting
description of the repaired array to Construct. If Blockoff can't interconnect the
grids, it asks Twist Repair for an alternate embedding for at least one block. In
re-repairing any block, Twist Repair continues its repair attempts from the point of
its last success. The process iterates until Blockoff succeeds or fails.
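
The control flow just described can be sketched in a few lines. This is only an illustration under stated assumptions: the names `blockoff`, `twist_repair`, and `can_interconnect` are hypothetical, each block's Twist Repair search is modeled as a generator so that a re-repair resumes from its last success, and the rule for choosing which block to re-repair (the first one with an alternative left) is our own simplification, since the text does not specify it.

```python
from typing import Callable, Iterator, List, Optional, Sequence


def blockoff(
    blocks: Sequence[object],
    twist_repair: Callable[[object], Iterator[object]],
    can_interconnect: Callable[[List[object]], bool],
) -> Optional[List[object]]:
    """Return one embedding per block, or None if the repair fails."""
    # One resumable search per block; next() continues after the last success.
    searches = [twist_repair(block) for block in blocks]
    current: List[object] = []
    for search in searches:
        first = next(search, None)
        if first is None:
            return None  # Twist Repair cannot repair this block at all
        current.append(first)
    while not can_interconnect(current):
        # Ask for an alternate embedding for at least one block.
        for i, search in enumerate(searches):
            alternate = next(search, None)
            if alternate is not None:
                current[i] = alternate
                break
        else:
            return None  # every block has run out of alternatives
    return current
```

With finite searches the loop always terminates, because every pass consumes one alternative. A fuller implementation would backtrack over combinations of sub-grids rather than advancing one block at a time.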

Repair is oriented toward rectangular blocks for several reasons. First,
this is the most natural, tractable structure in a checkerboard array. Second, the
General cell is suited to rectangular machines. Finally, any checkerboard machine
can be viewed as a composite of rectangles of various sizes.

We first detail Twist Repair, and then Blockoff. We examine their
response to actual embedding problems, compare their performance to Arm Repair,
and note their limitations. We also suggest reasonable extensions of the Repair
procedures we've written.

The simplest, most obvious way to embed a grid in a flawed array only
uses a cell as an essential node in the grid if the cell's row and column contain no
flawed cells. A good cell in a flawed line - a row or column - enters the Cross
state, so it interconnects essential neighbors. We'll call this repair technique
Simple Repair. Simple Repair of checkerboard arrays is analogous to Kukreja's
repair of cutpoint-connected arrays.
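
As a small illustration of Simple Repair, the sketch below counts the rows and columns that remain usable for essential cells. The function name `simple_repair` and the representation of flaws as (row, column) pairs are assumptions made for this example.

```python
def simple_repair(rows, cols, flaws):
    """Return (grid_rows, grid_cols) of the grid Simple Repair embeds.

    `flaws` is an iterable of (row, col) coordinates of flawed cells.
    A line containing any flaw carries only Cross cells, so it
    contributes no essential cells to the embedded grid.
    """
    flaws = list(flaws)
    flawed_rows = {r for r, _ in flaws}
    flawed_cols = {c for _, c in flaws}
    return rows - len(flawed_rows), cols - len(flawed_cols)
```

For example, a single flaw in a 10 x 10 array leaves a 9 x 9 grid, while two flaws that share a row cost one row and two columns.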

Note that Simple Repair is the best possible grid-embedding repair 
when an array has few flawed cells. If an array has only one flawed cell, an 
embedded grid must have at least one less row and one less column than the 
flawed array; the flawed cell’s row and column are bottlenecks. 

Unfortunately, Simple Repair is very inefficient as the number of
flaws in an array increases. For such an array, we'd like an approach that is able
to twist a grid’s lines through an array, so that some cells in flawed lines can still 
be used as essential cells. The L-turn and R-turn, cooperating with the Cross, are 
ideal for this purpose. Because of the way repaired blocks must interface, we 
assume a grid’s lines must extend from one side of a block to its opposite side. 

The Twist Repair approach, which includes Simple Repair, uses 
horizontal and vertical adjustment lines extending completely through a flawed array 
(see figure 4.6). Any flaw on an adjustment line must be at the junction of a 
horizontal and vertical adjustment line. Adjustment lines break the array into boxes
- rectangular regions of cells. At most one flawed cell is allowed in each box. If a
Fig. 4.6 Flawed 15x20 Array Twist-Repaired Into A Perfect 10x14 Array


Legend: two symbols mark the adjustment lines.

Explanation:

The relcon network above indicates the states of good cells and flawed
cells in a grid-repaired array. Flawed cells are indicated by an X. Good, unused
cells in an arbitrary state are indicated by *. Good cells that are essential cells in
the grid are indicated by +. Other cells are used to interconnect essential grid
cells. The L-turn state is indicated by one of three symbols, depending on the
context. Similarly, the R-turn state is indicated by one of three symbols, and the
Cross state by one of two. Note that jogging a wire requires the use of at least
two L-turn or R-turn states.
line of boxes is free of faults, adjacent boxes in the line interconnect across
adjustment lines via Cross cells. If a box is in a row (or column) of boxes, some of 
which contain flawed cells, one row of the box is not used for essential cells. If 
the box contains a flaw, the flaw’s row is the row with no essential cells; all 
unflawed cells in that row of the box assume the Cross state. If a box is in a row 
of boxes with flaws, and the box contains no flaw, an arbitrary row may be put 
into the Cross state. Thus all the boxes in a row have the same number R of rows 
useable as rows of essential cells. Cross, L-turn, and R-turn states are used in 
adjustment lines between boxes in a row to yield R embedded grid rows extending 
through all the row’s boxes. 
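
The box rules above can be turned into a short check for one candidate placement of adjustment lines. This is a sketch under our own assumptions, not the thesis program: the name `twist_grid_size` is hypothetical, adjustment lines are given as row and column indices, and every cell on an adjustment line is treated as a non-essential link cell.

```python
from bisect import bisect_left


def twist_grid_size(rows, cols, flaws, h_lines, v_lines):
    """Return (grid_rows, grid_cols) for one placement, or None if invalid.

    `h_lines` and `v_lines` hold the row and column indices used as
    horizontal and vertical adjustment lines.
    """
    h, v = sorted(set(h_lines)), sorted(set(v_lines))
    hset, vset = set(h), set(v)
    flawed_boxes = set()
    for r, c in flaws:
        on_h, on_v = r in hset, c in vset
        if on_h and on_v:
            continue  # a flaw at a junction of adjustment lines is allowed
        if on_h or on_v:
            return None  # a flaw on a line must sit at a junction
        box = (bisect_left(h, r), bisect_left(v, c))
        if box in flawed_boxes:
            return None  # at most one flaw is allowed per box
        flawed_boxes.add(box)
    # Each row (column) of boxes holding a flawed box gives up one line.
    flawed_box_rows = {i for i, _ in flawed_boxes}
    flawed_box_cols = {j for _, j in flawed_boxes}
    return (rows - len(h) - len(flawed_box_rows),
            cols - len(v) - len(flawed_box_cols))
```

With no adjustment lines and a single flaw the result matches Simple Repair, which is consistent with the statement that Twist Repair includes Simple Repair.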

Several considerations make the Twist Repair approach a reasonable
one. Because exhaustive consideration of all repair possibilities is computationally
excessive, a reasonable, heuristic approach is necessary. Simple Repair is
inadequate for most arrays with more than a few flaws. Twist Repair recognizes
the equivalence of many specific embeddings. For instance, an adjustment line that
doesn't include any flawed cell may occupy any line of cells between two flawed
cells; all such lines are equivalent. Recognition of equivalence limits computational
difficulty. Furthermore, this allows Blockoff more flexibility in interconnecting
blocks repaired by Twist Repair. We found that forcing L-turn and R-turn links
onto adjustment lines results in far less repair confusion and inefficiency than less
restrained use of these states. Consider snaking an embedded grid's row through
a flawed array of unbalanced cells, such as a General array. The only possible
essential cells in the snaking path through the flawed array are those cells which
the path links to cells on the same row in the flawed array. This suggests that
jogging of the line and movement of the line in the vertical direction should be
limited. Twist Repair often uses all the side-sets of cells in the L-turn and R-turn
states; this efficiency helps minimize the number of cells used as repair links.
Twist Repair also attempts to place essential neighbors close to each other in a
flawed array. This is helpful for two reasons. First, since wires between essential
neighbors are useless as essential cells, it's important to minimize the number of
cells in each wire. Second, an embedded machine's maximum speed is limited by
delays through wires; intended processing is only done at essential cells. Our
ultimate justification for Twist Repair is that it is better than any other methods
we've considered for repairing small rectangular arrays to embed grids.

The Twist Repair program's inputs are a flaw pattern and a request for
a minimum acceptable number of grid rows and columns. As in the arm
experiments, a square array's flaws are randomly generated. Starting with a good
guess of where to draw adjustment lines, Twist Repair considers alternative
adjustment line placements exhaustively - ignoring equivalent placements - until it
succeeds. Table 4.1 is analogous to a table given for balanced arms, showing the
best square grid Twist Repair embedded in experiments varying the number and
distribution of flaws in the square array.

Figure 4.7 shows curves based on the information in the table. The
curves show the average of %oftotal for a given %flawed, for various array sizes.


Table 4.1 Results Of Twist-Repair Grid-embedding Experiments

Key: %flawed - flawed cells as percent of all cells
cells - total cells in square array
flaws - total flawed cells in array
max-grid - the largest square grid our program embedded
%oftotal - max-grid as percent of cells
timelim - time limit, in seconds
time - the time the program ran
%oftimelim - time as percent of timelim
* - For two starred (or unstarred) arrays of the same size,
one set of flaw coordinates is a subset of the other.

Table (only four columns of the first page are legible):

max-grid   %oftotal   timelim   time
121        54         225       17
144        64         225       16
196        31         625       117
Answer not found in timelim
49         49         100       [illegible]
36         36         100       [illegible]
81         20         400       263
100        25         400       231
64         28         225       7.1
64         28         225       2.4
36         6          625       198
Answer not found in timelim
Answer not found in timelim
64         16         400       24
Answer not found in timelim

[The rows on the second page of Table 4.1 are not recoverable.]
Fig. 4.7 Graphs For Twist Repair Experiments


%oftotal is averaged for a given value of cells and %flawed.

The smooth, consistent nature of these curves suggests the conclusions listed
below:


1) For a given array size, %oftotal drops with increases in %flawed.
This drop tends to be greatest for small %flawed, milder as %flawed
increases, and non-existent after %oftotal reaches 0.

Consider the curve of %oftotal as a function of %flawed, for a
given square array. Let E be the number of cells in a line of the array.
The first flaw introduced into the array forces %oftotal to drop from
100 to $100(E-1)^2/E^2$, while %flawed increases from 0 to $100/E^2$. Thus
the slope of the curve is $1-2E$ for %flawed near 0. This explains why
%oftotal drops faster for larger arrays in this region.

Consider an array with several flaws. Introduction of a new
flaw may not cause a decrease in %oftotal. For instance, the flaw may
fall at the intersection of two adjustment lines, or in a box where a flaw
had been assumed (to allow the box to interface with adjacent flawed
boxes, as discussed earlier). Over the set of all flaw distributions for
an array, the probability that a new flaw will not cause a decrease in
%oftotal tends to increase with the number of flaws in the array. At
worst, a new flaw will eliminate one row and one column of the former
repaired array. If the former repaired array is smaller than the original
array, i.e. if the repaired array has any flaws, at worst the new flaw
decreases %oftotal less than previous "worst possible" flaws. These
considerations help explain the fact that %oftotal drops less rapidly as
%flawed increases.

2) %oftotal drops faster with %flawed for larger arrays, because a
given %flawed implies a higher percentage of flawed lines for a larger
array.

We've already analyzed this situation for %flawed near 0. We
found the negative slope of %oftotal versus %flawed was directly
proportional to a square array's side length, E. As %flawed increases,
the particular distribution of flaws influences %oftotal. However, it's
easy to see why %oftotal tends to be smaller for larger arrays, for a
given %flawed.

Given a fixed %flawed, large arrays tend to have a higher
percentage of flawed lines: the number of flaws is proportional to the
area, but the number of lines is proportional to the square root of the
area. Consider two arrays, one with E=10 and one with E=100, at
%flawed = 1. For E=10, the one flaw implies %oftotal=81. For E=100,
the best possible distribution of 100 flaws puts each at one of the 100
nodes associated with 10 horizontal and 10 vertical adjustment lines.
%oftotal is then 81. Most other distributions require the jogging of grid
lines, and %oftotal is then usually significantly smaller than 81. One
exception occurs in the unlikely event that all 100 flaws occupy the same
row or column. The array is effectively cut, so %oftotal=0.

A more general perspective provides a strong argument that
moves in the direction of a proof. The flaw distributions in two arrays
are equivalent if there's a one-to-one mapping between the flaws in the
two arrays such that the following is true. If an arbitrary flaw in one
array has a certain relative position with respect to the other flaws in
that array, the corresponding flaw in the second array has the same
relative position with respect to corresponding flaws in the second
array. If one of a flaw's coordinates is X, then the relative position,
with respect to that coordinate, of a flaw whose corresponding
coordinate is Y depends on which of the five following, mutually
exclusive, collectively exhaustive statements is true: X+1<Y, X+1=Y,
X=Y, X=Y+1, X>Y+1.

Now consider two square arrays with different sizes, but
equivalent flaw distributions. The first array has E rows, n flaws, and
E-F grid rows in an optimally embedded square grid. As %flawed has
climbed from 0 to $100n/E^2$, %oftotal has dropped from 100 to
$100(E-F)^2/E^2$. The second, larger array has K.E rows. Since Twist Repair
notices its equivalent flaw distribution, the second array's optimum
square grid has K.E-F grid rows. Here %flawed has climbed from 0 to
$100n/(K.E)^2$ as %oftotal has dropped from 100 to $100(K.E-F)^2/(K.E)^2$.
The ratio of the change in %oftotal to the change in %flawed is, in
magnitude, $(2E.F-F^2)/n$ for the first array, and $(2K.E.F-F^2)/n$ for the
second array; an equivalent flaw distribution is more costly in the larger
array. Any flaw distribution in an array has a corresponding, equivalent
distribution in a larger array. However, the fact that not all flaw
distributions in an array have an equivalent distribution in a smaller array
precludes simple extension of our reasoning to a proof that, for larger
arrays, %oftotal drops faster as %flawed increases. It might be possible to
make such a proof by defining some sort of loosely equivalent flaw
distributions.

3) For a given array size, %oftotal drops from 100 to 0 fairly smoothly
as %flawed increases from 0 to a number N dependent on array size and
specific flaw distribution. (For our experiments, 11.2 ≤ N ≤ 17.78.) This
contrasts with growth of arms, where %oftotal decreases gradually and
smoothly until it reaches a point where it plummets, usually for %flawed
approximately equal to 28.

4) Repair efficiency is much smaller for grids than for arms. 

5) For arrays with more than a few (approximately five) flaws, Twist
Repair is far superior to Simple Repair. For instance, in the unstarred
array with 625 total cells and 20 flawed cells, Twist Repair embedded a
14 x 14 square grid. Simple Repair embedded a 4 x 4 square grid for
the same array.

6) The time to repair an array varies widely, even for a constant
array-size and %flawed. The ratio of the time to repair an array to the
number of cells in the array tends to be higher for larger arrays. For a
particular array, the time for repair is relatively low when there are
very few flaws. As flaws are introduced, repair time tends to climb
gradually, reach a peak, and then descend rapidly. This is because
repair time is roughly proportional to the number of non-equivalent
adjustment line placements. If an array has very few flaws, there are
few non-equivalent adjustment lines. As flaws are introduced, the
number of non-equivalent adjustment lines increases. Eventually an
array becomes so crowded with flaws that it's difficult to find an
adjustment line that doesn't include a flaw. If an adjustment line
contains more than one flaw, several associated lines are required to
satisfy the constraint that every flaw on an adjustment line be at the
intersection of a horizontal and vertical adjustment line. This reduces
the number of non-equivalent adjustment lines for very flawed arrays.
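
The arithmetic behind conclusions 1 and 2 can be checked with exact fractions. The sketch below is only a numerical check; the helper names `pct_of_total`, `pct_flawed`, and `drop_ratio` are hypothetical, and E, n, F, and K follow the symbols used in the argument above.

```python
from fractions import Fraction


def pct_of_total(E, lost):
    """%oftotal for an E x E array whose best square grid has E - lost rows."""
    return Fraction(100 * (E - lost) ** 2, E * E)


def pct_flawed(E, n):
    """%flawed for an E x E array containing n flawed cells."""
    return Fraction(100 * n, E * E)


def drop_ratio(E, n, F):
    """Drop in %oftotal per unit rise in %flawed, starting from a perfect array."""
    return (pct_of_total(E, 0) - pct_of_total(E, F)) / pct_flawed(E, n)
```

For a single flaw, `drop_ratio(E, 1, 1)` equals 2E-1, the magnitude of the slope 1-2E. For equivalent flaw distributions, `drop_ratio(K * E, n, F)` exceeds `drop_ratio(E, n, F)` whenever K > 1, which is the sense in which the same distribution is more costly in the larger array.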


Experiments with Twist Repair suggest a new grid-embedding strategy.
We notice that for a given %flawed, %oftotal tends to be substantially higher and
%oftimelim significantly lower for smaller arrays. This difference becomes more
significant as %flawed increases, until %flawed is so large that all grid-embedding
attempts are futile. This suggests that embedding a grid in a large array should be
done by breaking the array into blocks of optimum size, separated by
interconnection strips. Each block is repaired via Twist Repair, and its grid
outputs are connected across the interconnection strips to the grid outputs of its
neighboring blocks. In fact, experiments show that such a procedure is superior to
Twist Repair for large arrays with many flaws.

A block's optimum size is determined by a tradeoff. Decreasing block-size
tends to increase %oftotal within each block, but it also decreases the total
area devoted to blocks by increasing the number of interconnection strips. If
%flawed is 0, it's pointless to waste any cells on interconnection strips; there
should be one maximum-sized block. As %flawed increases, the optimum block-size
decreases. Assume that the overriding factor in embedding success is %oftotal in
each block. For large enough arrays, the fraction of cells used in blocks, given
each block has E cells in a line, is about $(E/(E+1))^2$. This number is 100/121 for
E=10, and 400/441 for E=20. Using the curves of figure 4.7, this indicates that
E=10 is superior to E=20 for %flawed greater than about 1.5, given our
assumption. This indicates how the curves and the value of $(E/(E+1))^2$ may be used
to suggest an optimum block-size for a given %flawed. Experiments with Blockoff
have confirmed that there is a fairly predictable, optimum block-size for a given
flaw density. This fact of an optimum block-size suggests improved
grid-embedding can come from breaking an array into blocks whose approximately equal
size is determined by the array's flaw density. Then the simplest approach assigns
identical sub-grids to all blocks of the same size. This approach is limited when
some blocks have a disproportionately high number of flaws. This situation often
arises with current IC slices, where flaws tend to cluster. Since a very flawed
block can only contain a small grid, that block is unable to link up with all the grid
outputs of its neighboring, less flawed blocks. This limits the number of grid-rows
in its row of blocks.
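
The block-size tradeoff can be written out as a short calculation. In this sketch the function names are hypothetical, and the %oftotal values passed to `better_block_size` stand for numbers read off curves like those of figure 4.7; the ones shown in the usage note are made up for illustration.

```python
from fractions import Fraction


def block_fraction(E):
    """Approximate fraction of cells lying inside E x E blocks,
    with one line of cells per interconnection strip."""
    return Fraction(E, E + 1) ** 2


def usable_percent(E, pct_oftotal_in_block):
    """Essential cells as a percent of the whole array."""
    return block_fraction(E) * pct_oftotal_in_block


def better_block_size(candidates):
    """`candidates` maps a block side E to the %oftotal expected per block."""
    return max(candidates, key=lambda E: usable_percent(E, candidates[E]))
```

With illustrative values of 40 percent for 10 x 10 blocks and 30 percent for 20 x 20 blocks, `better_block_size({10: 40, 20: 30})` picks E=10, because the smaller blocks' higher per-block yield outweighs the cells lost to extra strips.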

Before considering how Grid Repair should handle blocking off an array
containing flaw clusters, it's useful to examine the repair problem more generally.
It's quite clear that a heuristic approach is necessary if Repair is to efficiently
repair arrays with many flaws. Twist Repair is time-consuming, especially when
one wants to place a near-largest grid into a flawed array. We'd therefore like to
be able to determine a priori the feasibility of a certain repair, in terms of
computational difficulty and probability of success. This is particularly true if
Blockoff is used to interconnect many blocks. Assume Blockoff operates on m
blocks, and there are $g_n$ satisfactory, non-equivalent sub-grids that can be
embedded in block n. Let P be the product of $g_n$ as n varies from 1 to m. There
are P combinations of sub-grids which Blockoff may try to interconnect to form an
embedded grid. If a high percentage of these P combinations are consistent with
the desired grid, Blockoff may quickly succeed. At the other extreme, Blockoff
would spend a time proportional to P in vainly considering each of the
combinations.
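
A quick calculation shows how fast P grows. The function name `combination_count` is hypothetical, and the per-block counts in the usage note are illustrative rather than measured.

```python
from math import prod


def combination_count(subgrid_counts):
    """P: the number of sub-grid combinations Blockoff might have to try,
    given the number of satisfactory sub-grids for each block."""
    return prod(subgrid_counts)
```

Sixteen blocks with only four alternatives each already give 4**16, or about 4.3 billion, combinations, which is why an a priori feasibility estimate is attractive.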

Happily, Repair may use a rather simple heuristic approach to reduce
repair time. Let F be a success function which estimates the grid-size that Repair
can reasonably expect to embed in a given array. Most simply, F is a function of a
square array's size and its flaw density. F can be refined in various ways we'll
consider. For instance, an input to F could state the probability F's estimate is not
an over-estimate. If non-square grids and blocks are considered, F can depend on
their specified shapes. It's reasonable to obtain F experimentally, because of the
monotonic nature of F. For instance, we've observed that F's output decreases as
an array's dimension or flaw density increases. This monotonicity enables us to
estimate F by experimentally determining some of its key values, and interpolating
to find its other values. F is Repair's heuristic guide.
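
One way to build such an F from a handful of measurements is piecewise-linear interpolation. The sketch below assumes a single array size and treats F as a function of %flawed only; the name `make_success_function` and the sample points in the usage note are hypothetical.

```python
from bisect import bisect_right


def make_success_function(samples):
    """Build F from measured (pct_flawed, pct_oftotal) pairs for one array size.

    Values between samples are interpolated linearly; values outside
    the sampled range are clamped to the nearest sample.
    """
    points = sorted(samples)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def F(pct_flawed):
        if pct_flawed <= xs[0]:
            return ys[0]
        if pct_flawed >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, pct_flawed)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (pct_flawed - x0) / (x1 - x0)

    return F
```

For instance, `make_success_function([(0, 100), (4, 20), (8, 0)])` returns an F with F(2) equal to 60. Because the measured F is monotonic, linear interpolation never produces an estimate outside the range of its two neighboring samples.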

Now consider the following procedure adapted to embedding a grid
containing R rows and C columns in an array that may contain flaw clusters. Repair
uses F to break the flawed array into approximately equal blocks whose size
depends on the array's dimensions and average flaw density. F suggests the block
size that is expected to yield the maximum embedded grid. Repair then considers
each line of blocks, associating with each line a number equal to the number of
lines F associates with the most flawed block in the line. Thus Repair recognizes
the difficulty of snaking a grid's rows or columns through a cluster of faulty cells.
Repair finds the sum S of all the numbers associated with the row lines. If S < R,
embedding the specified grid will be difficult or impossible; Repair's action
depends on whether it's willing to spend a lot of computation on what is probably
a vain effort. (This decision can be made implicit if a success-probability
parameter, like the one we've discussed, is passed to F.) If S = K.R, where K is
greater than or equal to 1, Repair multiplies each row number by about 1/K, so
that all the row numbers sum to R. An analogous procedure is applied to the
columns of blocks. If Repair decides to call Blockoff, Repair has heuristically
determined the size of the sub-grid assigned to each block. Thus F facilitates a
heuristic for guessing whether a given repair will succeed, and for choosing the
assignment given to Blockoff.
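
The row allocation step can be sketched as follows. The names `line_estimate` and `allocate_lines` are hypothetical, and the text only says each number is multiplied by "about 1/K"; the largest-remainder rounding used here is our own choice for making the scaled numbers sum exactly to R.

```python
def line_estimate(block_estimates):
    """Grid lines expected through one line of blocks: the most flawed
    block, i.e. the one with the smallest F estimate, is the bottleneck."""
    return min(block_estimates)


def allocate_lines(line_estimates, wanted):
    """Scale the per-line estimates so that they sum to `wanted`.

    Returns None when the estimates sum to less than `wanted` (S < R).
    """
    S = sum(line_estimates)
    if S < wanted:
        return None
    # Integer arithmetic: estimate * wanted / S, split into floor and remainder.
    floors = [e * wanted // S for e in line_estimates]
    remainders = [e * wanted % S for e in line_estimates]
    leftover = wanted - sum(floors)
    order = sorted(range(len(floors)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        floors[i] += 1
    return floors
```

For example, estimates of 6, 6, 2, and 6 rows with R = 10 give K = 2 and an allocation of 3, 3, 1, and 3 rows. The same function applies to columns.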


We've noted that grid-embedding is the most difficult repair problem in
a checkerboard array. The major practical importance of a repair procedure
geared toward grid-embedding is its generality; if Repair repairs a
flawed array specified as a large grid, Construct may embed in the flawed array
any machine whose essential network is a sub-network of the grid's essential
network. However, this generality diminishes the efficiency with which a non-grid
machine is embedded. [illegible] for the extreme case of [illegible].

The Blockoff procedure we've written accepts an
essential network containing rectangular sub-grids with specified wires between
adjacent sub-grids. It's easy to see why noticing limited communication paths
between a machine's high-relcon blocks promotes efficiency: Blockoff operates
under fewer constraints.

Table 4.2 and figures 4.8 and 4.9 summarize a series of experiments
that begins to explore how block-size and missing links affect the repair of
high-relcon machines. Table 4.2 summarizes the data from the experiments, and figures
4.8 and 4.9 give Blockoff-produced pictures of repaired arrays.
The experiments all used %flawed = 5, which figure 4.7's curves
suggest is a region where a block-size of 10 x 10 is better than a block-size of 20 x 20.

Table 4.2 Experiments With Three [illegible]

Blockoff operated on arrays with 5% flawed cells. For a given flawed
array, at most three experiments were performed. In each experiment, Blockoff
tried to put the same square grid in each of the array's approximately
equal-sized blocks. Blockoff's best result is given in each entry. Unless
otherwise noted, we found it impossible to improve on this result, given the
constraints. In 10-connect, Blockoff allocated 10 x 10 blocks for each grid, and
tried to connect the grids into one large grid. In 10-noconnect, Blockoff did not
interconnect the small grids. In 20-connect, Blockoff allocated 20 x 20 blocks for
each grid, and interconnected the grids. [illegible]

  Array   Experiment     Best Blockoff Result         Time (seconds)

  10x10   10-connect     1 6x6 grid                   2.3
  20x20   10-connect     4 4x4 grids = 8x8            47.8
  20x20   10-noconnect   4 4x4 grids                  9.2
  20x20   20-connect     1 8x8 grid                   63.1
+ 40x40   10-connect     16 3x3 grids = 12x12         58.7
  40x40   10-noconnect   16 4x4 grids                 32.2
  40x40   20-connect     4 5x5 grids = 10x10          832
* 80x80   10-connect     64 2x2 grids = 16x16         963
  80x80   10-noconnect   64 3x3 grids                 188
! 80x80   20-connect     [illegible]

+ When asked to put 16 4x4 grids in this flawed array, Blockoff was still thinking
after 45 minutes. Then we interrupted and terminated Blockoff.

* When asked to put 64 3x3 grids in this flawed array, Blockoff was still thinking
after 27 minutes. Then we interrupted and terminated Blockoff.

! When asked to put 16 3x3 or 4x4 grids in this flawed array, Blockoff was still
thinking after 22 minutes and 9.5 minutes, respectively. Then we interrupted and
terminated Blockoff.

Fig. 4.8 Blockoff's Repair Of 20x20 Array With 5% Flawed Cells

C) 20-connect embeds one 8x8 grid

Fig. 4.9 Blockoff's Repair Of A 40x40 Array With 5% Flawed Cells

B) 10-noconnect embeds sixteen 4x4 unconnected grids
C) 20-connect embeds four 5x5 interconnected grids
D) Only one link between adjacent 4x4 grids
Indeed, 20-connect always took substantially longer to repair an array than did
10-connect. Furthermore, 20-connect never embedded a larger machine than 10- 
connect, and sometimes embedded a smaller machine. 10-noconnect always 
embedded at least as many essential nodes as the others, because 10-noconnect 
works on a grid with links missing. We commanded Blockoff to place the same 
square grid in each of an array’s blocks, because we didn’t want to help Blockoff 
by implicitly telling it the location of flaw clusters. This constraint on Blockoff 
limited its performance; this explains why we can see ways to snake extra grid 
rows and columns through the flawed arrays. Figure 4.8.A indicates that the 
lower-right block of the 20 x 20 flawed array limited the performance of 10- 
connect and 10-noconnect. Similarly, figure 4.9.B shows that certain very flawed
blocks limited Blockoff’s performance. This argues for use of the success heuristic 
suggested earlier. Figure 4.9 indicates that Blockoff’s performance diminished as 
more links were introduced between sub-grids. 

Comparing the graphs for Twist Repair experiments with table 4.2
shows Blockoff's superiority to Twist Repair as a flawed array's size increases.
For %flawed equal to 5, Twist Repair achieved a %oftotal of 6 for a 25 x 25 array.
This indicates that for 40 x 40 and 80 x 80 arrays, Twist Repair would have
achieved %oftotal substantially under 6. For %flawed equal to 5, Blockoff used
10-connect to achieve a %oftotal of 9 for a 40 x 40 array, and %oftotal greater than
or equal to 4 for an 80 x 80 array. This and other comparisons we've made of
Blockoff and Twist Repair indicate Blockoff is superior when %flawed remains
constant as an array's size increases.
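
The %oftotal figures quoted for Blockoff follow directly from the grid counts in table 4.2. The helper below, with the hypothetical name `pct_oftotal`, makes the arithmetic explicit.

```python
def pct_oftotal(n_grids, grid_side, array_side):
    """Essential cells in n_grids square sub-grids, as a percent of a
    square array with array_side cells in a line."""
    return 100 * n_grids * grid_side ** 2 / array_side ** 2
```

Sixteen 3 x 3 grids in a 40 x 40 array give 9 percent, and sixty-four 2 x 2 grids in an 80 x 80 array give 4 percent, matching the values cited above.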

We wish we could offer more experimental results from Repair.
However, Repair's computation time demands have made further experiments
unfeasible.

We now re-consider the improved Repair procedure. We see that it
is oriented toward embedding a machine abstractly described as interconnected
rectangular sub-grids. Repair may use a heuristic approach to decide what part of
a flawed array should accept each sub-grid. After making such an assignment,
Repair calls Blockoff. Blockoff may use a heuristic like Repair's to decide how to
embed each sub-grid. If the sub-grid is sufficiently small, Twist Repair is
appropriate. Otherwise Blockoff may break the sub-grid into blocks, and present
Repair with each sub-grid. That is, Repair may be recursive. In any case, an array
is eventually broken into blocks repaired by Twist Repair, and interconnected by
Blockoff.


We've purposely ignored. discussing. an embedded maghine’s 
__ Interconnections to other machines, either in or out of its array... Tradeoffs relating 
___ to this question, ere anelogous to those for erm-embedding. Blockoff may be easily 
adapted to accepting inputs describing which of the celis.at a machine's periphery 
carry the machine’s inputs and outputs. Handling this.is like. handling. the interface 
between linked sub-grids. In each case, a particuler call. (for, instance, one. with a 
lead to the extra-array world) should connect to @ particule, © 

One can envision further levels of Repair sophistication for
increased embedding efficiency. Choice of sophistication level depends on the
character of expected repair problems. For instance, very large arrays might
benefit by interconnection strips wider than the line between large blocks. If
substantial sections of an essential machine were likely to have essential cells with
few essential neighbors (as the General processor-tester-repairer did),
efficiencies would result from special handling of these sections. Indeed, perfect
machines should probably be designed in a modular fashion, with relatively few
communication paths between modules. The need to limit inter-module
communication paths is already recognized in the design [...]

We briefly sketch a promising repair technique for such high-relcon
machines. An essential network is categorized in the following way. Each
essential node with three or four essential neighbors is associated with some
rectangular high-relcon block in a compact Blockoff-compatible way that’s been
discussed. Those wires and essential cells with one or two essential neighbors
that are not in a high-relcon block are associated with low-relcon blocks (see figure
4.10). A straight horizontal or vertical line through an essential network passes
through at least one high-relcon or low-relcon block. That block which the
Success Heuristic F estimates as least efficiently repaired, given the flawed
array’s average flaw density, determines how many flawed array lines should be
allocated for an essential network line. For instance, the expected embedding
efficiency for the large high-relcon block dictates the number of flawed array
columns devoted to perfect array columns 0 through [...]. The relatively high
embedding efficiency of transmission links dictates a lower multiple of flawed array
columns devoted to column 6. Thus Repair uses the success heuristic to estimate
whether a repair will succeed, and to get a rough estimate of how to allocate
flawed array space. Repair may then refine its estimates by considering the
actual number of flaws in each allocated block. Repair then uses Blockoff to repair
the high-relcon blocks. Given a success here, Repair calls a procedure devoted to
all the low-relcon essential cells and wires that [...] block. The
main point of this approach is to [...] essential cells with flawed [...]
vain, [...] by Blockoff to repair an
array blocked off in an unrepairable way.

Fig. 4.11 Blocking Off A High-relcon Essential Network
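The allocation rule can be made concrete with a small sketch. We assume here, purely for illustration, that the success heuristic yields for each block an embedding efficiency between 0 and 1, read as perfect-array columns obtained per flawed-array column spent. The function name `columns_to_allocate` and this reading of "efficiency" are assumptions of the sketch, not definitions taken from Success Heuristic F.

```python
import math
from typing import Sequence


def columns_to_allocate(perfect_columns: int, block_efficiencies: Sequence[float]) -> int:
    """Flawed-array columns to reserve for a run of perfect-array columns.

    The least efficiently repaired block crossed by the columns sets the
    allocation, as the text describes; the efficiency scale is assumed.
    """
    worst = min(block_efficiencies)
    if not 0.0 < worst <= 1.0:
        raise ValueError("efficiencies must lie in (0, 1]")
    return math.ceil(perfect_columns / worst)
```

Under these assumptions, six perfect columns that cross a block of estimated efficiency 0.5 would be given twelve flawed-array columns, while a single column of transmission links at efficiency 0.8 would be given two.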

If Repair uses details of a machine’s essential network to increase
embedding efficiency, Repair needs a description of that network. In the least
sophisticated case, a designer could specify that network to Repair; we’ve done
this in our experiments with Blockoff. However, it is fairly easy to write a
procedure which abstracts a machine’s essential network from its description as an
embedded machine. The procedure “works back” from the embedded machine’s
outputs to find the essential cells and wires of the machine. The resulting
essential network could be blocked off by analysis of the location of high-relcon
regions. Straight lines through the network that yielded a low density of links
would indicate reasonable boundaries between high-relcon regions.
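The “works back” step amounts to a backward traversal from the outputs. The following minimal sketch assumes the embedded machine is given as a mapping from each node to the nodes that drive it; that representation, and the name `essential_nodes`, are hypothetical conveniences rather than the description format Repair actually receives.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def essential_nodes(drivers: Dict[str, List[str]], outputs: Iterable[str]) -> Set[str]:
    """Return every node that can influence one of the given outputs.

    `drivers[n]` lists the nodes whose signals feed node `n` (assumed format).
    """
    essential: Set[str] = set(outputs)
    queue = deque(essential)
    while queue:
        node = queue.popleft()
        for source in drivers.get(node, []):
            if source not in essential:
                essential.add(source)  # reached by working back from an output
                queue.append(source)
    return essential
```

Nodes that never appear in the result (cells whose signals reach no output) are not part of the essential network and need not be embedded.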

We’ve discussed an effective Repair procedure, and actually written
and analyzed fundamental components of this procedure. Nevertheless, repair of
high-relcon machines remains a largely unexplored area. Some programs we’ve
sketched remain to be implemented. Perhaps better methods of repair can be
found. More experiments would enable a better understanding of the heuristic
success-function’s nature. Interesting theoretical questions remain. Consider the
curve of the expected width of a square embedded grid versus the width of a
square flawed array, for some low, non-zero flaw density. Is there a repair
procedure such that this curve is monotonically increasing? Is there a repair
procedure such that the curve is above some positive-sloped straight line for very
large arrays? Can you produce such a procedure, or prove there isn’t one? This
is an important question, because its answer tells us the potential size limits on
grid machines embedded in arrays of a given flaw density. This helps us determine
the expected size limits of high-relcon machines that aren’t grids.

Section 4.5: Construct
Construct accepts three information inputs which dictate how Construct
loads cells in a flawed array. These inputs are:
1) a description of an essential machine stating essential states and
associated wiring;
2) a description of a repaired array, in which each cell’s processing
layer function is in one of the four categories we’ve mentioned -
flawed, essential cell, particular non-branching transmission state, or
unused good cell in arbitrary state; and
3) a description of the repaired array stating the side-sets
successfully activated and de-activated by Test’s loader.

We’ve noted that Construct’s precise nature depends on Repair’s
generality. In any case, Construct is very simple. First Construct “mentally” maps
a machine’s essential cells into a repaired array’s essential nodes. Then Construct
extends a loading arm into the flawed array, possibly touching all good cells and at
least touching and properly setting all the cells acting as essential cells or wires
between essential cells. The loading arm’s base may be any cell with access to
the cells that must be set. For instance, any of the cells of an embedded machine
would be an acceptable base. Setting the proper cells is even easier than growing
a long arm into an array. Construct knows the location of flawed cells, and may
extend, retract, or move its arm through side-sets Test successfully activated in
any way consistent with touching all the proper cells. Figure 4.11 shows the
result of a simulation demonstrating Construct’s ability to perform its task. The
simulating procedure moved its arm in a flawed array. All cells were initially in
either the X (flawed) or G (good) state. For simplicity, it was assumed that all
accessible side-sets of good cells could be successfully activated and de-activated.
The arm moved around in the array until all touchable cells were
touched. (This is doing more than is necessary.) The figure shows the state of the
loading arm when Construct succeeded. Of course, Construct could completely plan
its loading strategy via such a simulation before actually extending its arm into an
array.
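A planning simulation of this kind can be approximated by a simple reachability computation. The sketch below is not the simulation program used for the figure; it only computes which cells an arm rooted at a given base could touch, under the same simplifying assumption that every side-set of a good cell can be activated. The array encoding (one string per row, X for flawed and G for good) and the name `touchable_cells` are assumptions of the sketch.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (row, column)


def touchable_cells(array: List[str], base: Cell) -> Set[Cell]:
    """Cells a loading arm rooted at `base` can touch via good cells only."""
    rows, cols = len(array), len(array[0])
    if array[base[0]][base[1]] != "G":
        return set()  # a flawed base cannot support an arm
    touched: Set[Cell] = {base}
    stack = [base]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and array[nr][nc] == "G" and (nr, nc) not in touched:
                touched.add((nr, nc))  # the arm can be extended into this cell
                stack.append((nr, nc))
    return touched
```

Construct could compare this set against the cells that must be set; if any required cell is missing, no arm from that base can complete the loading. The actual sequence of extend, retract, and move operations is not modeled here.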
Section 4.6: Other Considerations In Realizing High-relcon Machines

We’ve considered the basic issues of testing, construction, and repair
for high-relcon machines in previous sections. Now we turn to less fundamental,
but important, aspects of high-relcon machines. We considered production and
marketing issues for arm machines in section 3.4. We suggested ways to satisfy
constraints imposed by the need for adequate array-access ports (chapter 3 called
them “arm bases”), the need for proper handling of shared power lines, and
volatility. These constraints have obvious analogs in high-relcon machines.
Because satisfaction of these constraints is also obviously analogous, we need not
consider these constraints further. Instead we concentrate on considerations
peculiar to high-relcon machines.

All the testing procedures we’ve discussed assume independence of cell
behavior. Gradual growth of an arm machine involves concurrent tests of an
individual cell and its associated machine. As soon as the last cell of an arm has
been tested, the arm is complete and tested. On the other hand, high-relcon cells
are independently tested before they’re included in an embedded machine. Testing
an embedded machine, or its modules, checks our independence assumptions. An
embedded machine may be tested like any digital machine, via its inputs and
outputs. Furthermore, test-link capability provides testability to high-relcon array
machines that’s not available in ordinary digital machines. Test links may connect
an embedded machine’s module with a test machine, to allow independent testing
of that module. A test link, terminated by a transmission-branch cell, may be used
as a probe which, at a given time, monitors the signals on a wire in an embedded
machine. After test links are used for module testing or probing, they can be
withdrawn. Of course, this assumes that the test arms do not affect the operation
of the eventually embedded machine; this is a safer assumption than a simple
cell-independence assumption.

The use of cells as wires in high-relcon machines necessitates special
considerations. In most hard-wired machines, it’s safe to disregard the delay
through wires; but this assumption is usually not valid in high-relcon machines
because the delay through a wire cell is close to the delay through some other
cell. A pair of essential neighbors may be linked by different-length wires in
different flawed arrays. Wire delays consequently decrease an embedded
machine’s maximum operation speed. Furthermore, they compound the “critical
race” problem, thereby making array machine designs more constrained than
non-array designs. A synchronous high-relcon machine must be clocked slowly enough
to allow for the delay through embedded wires. Other conventional techniques for
solving timing problems, such as ready-acknowledge signalling, may be employed
where needed for communication between modules in embedded machines.
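The clocking constraint can be illustrated with a rough calculation. The sketch assumes that each wire cell contributes exactly one cell delay (the text says only that the two delays are close) and that the slowest link between essential neighbors sets the clock period. The function name and both parameters are hypothetical.

```python
def max_clock_rate_hz(cell_delay_s: float, longest_wire_cells: int) -> float:
    """Upper bound on the clock rate of a synchronous embedded machine.

    Assumes every wire cell adds one cell delay, and that a signal must pass
    the longest embedded wire plus one essential cell within a clock period.
    """
    if cell_delay_s <= 0 or longest_wire_cells < 0:
        raise ValueError("delay must be positive and wire length non-negative")
    worst_path_delay = cell_delay_s * (longest_wire_cells + 1)
    return 1.0 / worst_path_delay
```

With a cell delay of 100 nanoseconds, a repair whose longest wire runs through four cells would limit the clock to roughly 2 MHz under these assumptions, against 10 MHz if all essential neighbors were adjacent.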

Array machines compensate for inherent limitations by providing added
capabilities, including automation-compatibility. We’ve seen that a simple array
facilitates testing and repair by its iterative nature, and by the fact that test and
repair facilities are built into a cell. An array’s simple structure also facilitates
computer-aided design. A designer could specify a machine as a perfect
embedded machine with timing constraints on its cells. A simple program could
check that an envisioned embedding satisfied these constraints. A more
sophisticated program could “compile” a machine’s high-level-language specification
into an acceptable array-embedded machine; this is difficult, but easier than
analogous computer-aided design in a less regular environment.

Section 4.7: High-relcon Machine Applications

We’ve discussed our approach to the most difficult testing and repair
processes for a checkerboard array - treatment of high-relcon machines. Abstract
description of an essential machine as an essential network focuses on the
properties of a machine that are important to test and repair. We’ve shown that
high-relcon machines have higher testing and repair costs than arm machines.
We’ve also shown that, even for high-relcon machines, our cellular approach offers
major integration, test, and maintenance advantages relative to other methods for
system implementation. In this section we consider applications merits of
high-relcon machines, relative to arm machines and non-array machines. We discuss the
General cell as one which enables realization of the benefits of high-relcon
machines.

Chapter 2 discussed the general advantages of cellular arrays, and
argued for our array approach. This approach attempts to meet system design,
production, and maintenance needs through standard, high-volume, flexible,
automation-oriented modules - cells and associated programs. We compared our
approach to other, less constrained approaches. Chapter 3 discussed balanced arm
machines using our approach. Earlier sections of this chapter compared testing and
repair processes for arm and high-relcon machines. This section highlights
performance features that haven’t been sufficiently covered.

Because the communication paths between cells in a high-relcon machine
are less constrained than those in an arm machine, a high-relcon machine provides
speed and flexibility advantages in certain applications. The suitability of a
particular type of essential machine depends on the utility of a given degree of
direct, simultaneous inter-cell communication for that machine. A serial-in,
serial-out shift-register is well-suited to arm machines, because each stage of the
register communicates directly with at most two other stages. Tree machines are
well-suited to machines which have only one section addressed at a given time;
therefore random-access and some other memories are well-suited to realization
as tree machines. In an arm machine, essential neighbors are always in adjacent
cells. In a grid embedded in a flawed array, essential neighbors aren’t necessarily
in adjacent cells; this diminishes the speed advantage of the embedded grid
machine. High-relcon machine realization is particularly suited to machines
composed of modules which communicate different information to three or more
other modules at the same time. Such machines might require complex cells to
awkwardly share the communication paths available in tree or arm machines.
For instance, building a processor as an arm or tree machine would probably
require complex cells, and suffer from low parallelism. Forcing all a machine’s
extra-array leads to connect to one cell makes realization of certain machines very
difficult. Thus high-relcon arrays provide additional information paths, but require
higher testing and repair costs when they are used for high-relcon machines. A
high-relcon array is most suited to machines which exploit the array’s available
information paths, such as the processor-tester-repairer built of General cells.


High-relcon arrays, such as General arrays, offer major advantages as
peripheral equipment in a computer system. The computer system offers the
array a non-volatile Array Programmer. The array offers a reliable, inexpensive,
[...] high-speed processing capability.

For many machine tasks, the General array is a correct compromise
between a single-sequence computer and a special hard-wired machine. A typical
computer’s performance advantages include computational power, flexibility, and
easy programmability. Its major disadvantage is slow performance relative to
hard-wired machines. The single-sequence computer processes only one
instruction at a time, with each instruction taking many gate-delays. The
ameliorating parallelism in some instructions is often wasted. For instance, an
algorithm that only operates on 1-bit words still uses an AND that operates on
larger words. The conventional computer is particularly ill-suited to irregular or
high-frequency real-time applications; handling incoming signals through interrupts
is particularly time-consuming and tricky. Computers are so clumsy at real-time
applications that they often rely on a hard-wired machine to buffer incoming
signals; this machine continuously monitors, collects, and pre-processes incoming
data. Many applications are more suited to a special-purpose machine, which
offers higher speed. Disadvantages of such a machine include high setup times and
high setup costs, especially if these costs are not distributed over a large number
of machines. Testing and repair of these machines can be particularly difficult and
[...]


A peripheral array, such as the General array, is a compromise between
the performances of these two most common machine approaches. The array may
be quickly and easily programmed to one of a large set of embedded machines.
For instance, a processing-intensive problem could be solved in an array which
interrupted the computer’s processor when it had solved the problem. Alternately,
a General array could be used as a processor component tailored to the
requirements of a particular processing task. The array provides a high degree of 
potential parallelism. Basic cell operations, those that occur in the cell’s function 
states, are faster than basic computer operations, but slower than the basic 
operations of a special-purpose hard-wired machine. Like a hard-wired machine, a 
General machine can continuously monitor and process incoming signals. 
Furthermore, our arrays have the added advantages of low cost and easy, 
automatic maintenance. 

Of course, an array’s suitability depends on its intended application area.
The General array is oriented toward narrow data paths; there is only one
processing input in each of a cell’s side-sets. Parallel algorithms, especially those
amenable to two-dimensional array solution, are particularly appropriate for the
General array. Many physical problems, such as temperature distribution on a
plate, are consistent with such an array solution. Such an array might benefit from
larger processing side-sets to accommodate numbers representing one of a wide
range of temperatures. However, such a macro cell with large side-sets could be
built of General cells. The General array is good at logic simulation. Real-time
applications which would otherwise require an expensive, low-volume special
machine are often suited to a high-relcon array.

An array’s utility depends on its size. This partially explains our 
interest in array repair. One way to increase an array’s size is to interconnect 
arrays. If one’s objective is a large array with a checkerboard array’s 
interconnection network, one must currently make many interconnections between 
neighboring arrays. This is fairly costly, even if one uses a special interconnect- 
array circuit board, because of the many IC leads involved. Our approach reduces 
the need for many leads between sub-arrays by relying on testing and loading 
arms, and by Repair’s block orientation. This block orientation recognizes that
most machines are composed of modules, and have few communication paths 
between the modules. 

A high-relcon array may also replace special-purpose machines in a
computer system. Here the array is most appropriate when
computer-maintainability is important.

The ability of an array-embedded machine to test, program, and repair 
its cellular environment is particularly attractive. Such a machine can form its 
cellular environment into machines appropriate to a given application at a given 
time. Two or more machines like the one we’ve designed can achieve high
reliability by monitoring and repairing each other. Each machine is embedded in a
sea of spare parts, cells, with enough cells to support many cell failures. When
one machine notices that the other has failed, it re-tests the other’s environment
before embedding a new, perfect machine. With three array-embedded machines,
a first good machine may continue normal operation while the second good one
repairs the faulty machine.

It’s amusing to consider the unlikely event of a form of “array cancer”, in 
which a faulty machine attempted to wipe out a properly working machine. Each 
embedded machine could guard against inappropriate attack with test arms for 
noticing attack, and a loader arm for fighting the attack. A machine could be 
programmed so that both its defense and attack programs required proper use of 
all the machine’s processor sections. With the right attack and defense programs, 
a perfect machine should then be able to dominate a malicious, faulty machine. 

If the General array is non-volatile or easily backed up by a power 
supply or loading source, it may be mass-produced and program-customized to 
provide inexpensive, low-volume machines inappropriate to microprocessor
realization. Sometimes added advantages come from the machine’s nature as a
standard part that can be tested, programmed, and repaired through limited
communication with a standard machine. Our arrays can even be repaired by a
remote machine connected to an array via communication links.

CHAPTER 5: TREE MACHINES

This chapter discusses embedded machines with a particularly simple 
nature. All cells of a tree machine are effectively linked to a common input bus 
and common output bus. Each cell is a balanced, essential cell whose function 
state includes a unique name. At any given time, only one cell may transmit its 
information out of the embedded machine. Examples of such a machine are paged
random-access and track-addressed sequential-access memories, with one cell per
page or track. The embedded machine’s simplicity means that a cell’s processing
layer can be designed so that all tree-like relcon networks with a given number of
nodes may correspond to equivalent embedded machines. This allows efficient use
of good cells in a flawed array, because an embedded machine can incorporate any 
good cell linked to its input-output (tree base) cell by some path of good cells. 
Because one form of tree is an arm, a flawed array embedding a tree machine can 
be repaired at least as efficiently as a corresponding array embedding an arm 
machine. Furthermore most large, flawed arrays may be repaired to embed 
random-access memories or other tree machines with average access time 
proportional to the square-root of the number of cells in the array. 

For specificity, we consider a paged random-access memory
implementation with the following characteristics. The RAM has 2^p pages, or cells,
with 2^w words of length L in a RAM on each page. Command and output words are
handled serially. The RAM has two input lines called Klock and Command, and one
output line called Return. When the RAM is ready to receive a command
specifying a “Read” or “Write” operation, the Command stream Klocked into the
RAM specifies the following:
1) a p-bit page address which selects the one cell of the embedded
machine with the identical name stored as p function-specification state
bits;
2) a w-bit address selecting a particular word within the page;
3) a Read/write bit specifying either a “Read” or “Write” operation;
and
4) if the command is a “Write”, the L-bit word to be written.

If the command is “Read”, the L Klock pulses after the command Klock the selected
word out of the embedded machine.
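The command format listed above can be made concrete with a short sketch that assembles the serial Command stream. The field order follows the numbered list. The bit order within each field (most significant bit first) and the encoding of the Read/write bit (1 for Read, 0 for Write) are assumptions of this sketch, since the text does not fix them; the function names are likewise hypothetical.

```python
from typing import List, Optional


def to_bits(value: int, width: int) -> List[int]:
    """Binary expansion of `value`, most significant bit first (assumed order)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given field width")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def command_stream(page: int, word: int, p: int, w: int, L: int,
                   write_data: Optional[int] = None) -> List[int]:
    """Bits Klocked into the RAM for one command.

    A Read is requested by leaving `write_data` as None.
    """
    bits = to_bits(page, p) + to_bits(word, w)
    if write_data is None:
        return bits + [1]                       # assumed encoding: 1 = Read
    return bits + [0] + to_bits(write_data, L)  # assumed encoding: 0 = Write
```

A Read command therefore occupies p + w + 1 Klock pulses and is followed by L further pulses that shift the selected word out on Return; a Write command occupies p + w + 1 + L pulses.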

Since the paged-RAM cell’s loader is the same loader detailed
previously, we focus on a balanced processing mechanism and associated
function-specification state bits for cells in a checkerboard array. Each of a cell’s side-sets
has one Insel input line specifying whether that side-set is selected to send Klock
and Command information directly into the cell. A cell in a working embedded
machine has only one of its Insel inputs high. The Insel-selected Klock and
Command information is broadcast to the cell’s neighbors via the cell’s Klock and
Command Outputs. A cell’s broadcast Return output is that cell’s RAM Output line
if the cell has been addressed; otherwise the Return output is the OR of from
zero to three Return inputs selected by four Retsel function-specification state
bits. Each Retsel state bit corresponds to one of a cell’s side-sets. Besides
determining whether a cell’s left Return input is selected to be ORed, the “left”
Retsel state bit is also the cell’s left Insel output. A corresponding statement is
true for a cell’s “right”, “up”, and “down” Retsel state bits. Thus an embedded
machine is organized so that cell A accepts a Return input from cell B if and only if
cell B accepts Klock and Command inputs from cell A. Input command information
enters a cell from one of its neighbors, and is accepted by up to three of its other
neighbors. Hence a given RAM with c cells can be realized by any tree-like relcon
network of c good cells consistent with the limits imposed by an array’s
interconnection network. Figure 5.1 shows relcon networks for two equivalent
machines in identical flawed arrays. The machines differ only in their access time.

Fig. 5.1 Relcon Networks For RAMs In Identical Flawed Arrays

A) Relcon network for one embedded RAM
B) An embedded RAM with better access time than A

Commands input to a tree’s base flow to the tips of the tree. Every link
that carries an input command in one direction carries a Return in the
opposite direction. An addressed cell’s Output information successfully
reaches the embedded machine’s output because the Output is ORed
with 0s as it flows to the embedded machine’s output. Maximum access
time is minimized by minimizing the longest path between a tree-tip and
the tree’s base. Machine B’s access time is better than A’s because A
has a circuitous path to node (3 0). The best expected access time
results from placement of a tree’s base at the center of its associated
array.
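The Return logic described above can be modeled in a few lines. The sketch treats one Klock period of a Read: the addressed cell drives its RAM output, and every other cell ORs the Return signals of the neighbors its Retsel bits select. The `children` mapping (each cell to the neighbors whose Return inputs it selects) and the function name are hypothetical stand-ins for the Retsel state bits; serial timing and cell delays are ignored.

```python
from typing import Dict, List


def return_signal(cell: str, children: Dict[str, List[str]],
                  addressed: str, ram_output: int) -> int:
    """Return output of `cell` during one Klock period of a Read."""
    if cell == addressed:
        return ram_output  # an addressed cell drives its own RAM Output line
    result = 0             # the OR of zero selected inputs is 0
    for child in children.get(cell, []):
        result |= return_signal(child, children, addressed, ram_output)
    return result
```

Because exactly one cell carries the addressed name, every unaddressed branch contributes 0, so the value seen at the base equals the addressed cell’s output whatever the shape of the tree. This is why all tree-like relcon networks with the same cells behave as equivalent machines.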

In a checkerboard array, access time is minimized by placing a tree’s
base cell at the center of a square region of cells; one diagonal of the square is a
row, the other is a column, and the diagonals cross at the tree base cell. When
such a strategy is used, the expected time required to send information to or from
the tip of a tree embedded in a flawed array is proportional to the square-root of
the number of tree cells. Expected access time is therefore proportional to the
square-root of the number of cells in large tree machines. In an n-dimensional
array, this expected access time is proportional to the “n”th root of the number of
tree cells when the tree’s base is at the center of a cube or hyper-cube.
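The square-root relationship is easy to check for the idealized case of a flaw-free checkerboard array, which is the only case this sketch covers. The cells within r nearest-neighbor steps of the base form exactly the diagonal square described above, and their number is 2r(r+1)+1, so r grows roughly as the square root of half the cell count. The function name is an assumption, and no claim is made here about flawed arrays beyond what the text states.

```python
def cells_within(radius: int) -> int:
    """Count checkerboard cells at most `radius` steps from a base cell.

    Steps are between nearest neighbors (up, down, left, right), so the
    region is a square whose diagonals are a row and a column.
    """
    return sum(
        1
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if abs(dr) + abs(dc) <= radius
    )
```

For instance, a radius of 10 steps already covers 221 cells, and doubling the radius to 20 covers 841, nearly four times as many.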

Since the cell we’ve discussed handles information serially, it needs a
counter and associated circuitry to coordinate activity. This counter is initialized
by the loader.

Techniques used for improving performance of conventional RAMs, such
as use of parity bits, are applicable to this approach. The RAM in each cell is
identical to conventional RAMs. The fact that a loading arm can rapidly shuffle the
names of cells in a machine without disturbing their RAM contents may be useful
for some systems’ memory management. If a simple paging system is willing to
effectively construct a page table by shuffling the names of memory cells, a special
page table and its associated delays are not required.

Test and repair of flawed arrays embedding tree machines is similar to 
test and repair for arm machines. A tree machine is grown cell-by-cell into the 
area around its base, and each extension is monitored by communication between 
the tree’s base and the Array Programmer. A cell in an embedded machine is 
always linked to the base cell by the shortest possible relcon path, and given a
unique name. A working cell in an embedded machine ignores inputs from flawed 
cells and dangling array inputs. 
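Growth by shortest relcon path corresponds to a breadth-first traversal of the good cells around the base. The sketch below assigns each reachable good cell a unique name, a parent (the neighbor from which it accepts Klock and Command), and a depth. The array encoding (X for flawed, G for good, borrowed from the Construct simulation), the naming order, and the function name `grow_tree` are assumptions; the monitoring dialogue with the Array Programmer is not modeled.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def grow_tree(array: List[str], base: Cell) -> Dict[Cell, Tuple[int, Optional[Cell], int]]:
    """Grow a tree breadth-first from `base` through good ('G') cells.

    Returns {cell: (name, parent, depth)}, where depth is the length of the
    shortest relcon path from the cell to the base.
    """
    rows, cols = len(array), len(array[0])
    if array[base[0]][base[1]] != "G":
        raise ValueError("base cell must be good")
    tree: Dict[Cell, Tuple[int, Optional[Cell], int]] = {base: (0, None, 0)}
    frontier = deque([base])
    while frontier:
        r, c = frontier.popleft()
        depth = tree[(r, c)][2]
        for cell in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = cell
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and array[nr][nc] == "G" and cell not in tree:
                tree[cell] = (len(tree), (r, c), depth + 1)  # next unused name
                frontier.append(cell)
    return tree
```

The largest depth in the result bounds the machine’s access time, so the same routine can be used to compare candidate base cells before any cell is actually loaded.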

Packaged memories are easily formed into larger memories by providing 
a few links between packages to allow growth of the tree through all the
packages. The number of cells in the tree is only constrained by the required
access time and the number p of page-address bits in each cell.

Overhead circuitry could be reduced by using triangular arrays instead 
of checkerboard arrays, if this was compatible with the production process. 

It’s obvious that this approach is applicable to any machine which may
be realized as a tree machine. Inputs and outputs to such a machine could be
parallel rather than serial. One could implement a many-tracked sequential-access
memory, with one cell for each track. Associative memories and even some
multi-processor systems (similar to the ETHER system) are compatible with this
approach.

These tree machines further evidence the fact that relaxing the
requirements on the communication paths between essential cells in an embedded
machine facilitates repair efficiency.

CHAPTER 6: CONCLUSION


This thesis has presented an LSI-oriented systems approach to test,
configuration, and repair of cellular arrays. We've specified standard modules that 
are built into the cells of a machine to facilitate testing, loading, and repair. Thus 
the mechanisms for testing and customizing a flawed array are built into a simple, 
iterated part. A computer may access these mechanisms via a few direct 
connections to an array. Programs allow the computer to maintain or re-customize 
the array. We’ve been careful to note our assumptions, and to discuss design 
approaches that help insure the validity of these assumptions in actual arrays. 

Development of terminology and models for programmable logic machines 
has helped us analyze important machine classes; these are arm, high-relcon, grid, 
and tree machines. A particular class of machine is characterized by the 
requirements placed on the communication paths between essential cells of any 
embedded machine in the class. A particular embedded machine is associated with 
a set of equivalent embedded machines. The nature of this set affects the 
testability and repairability of an array. Properties of a cell, such as balance, 
affect an embedded machine’s structure and associated equivalence class; and 
therefore affect the repairability of an array. 

There are reasonable practical and theoretical extensions of this work.
We believe that tying further theoretical inquiry to actual machine realization goals
will be most productive.

Tree and arm machines seem particularly suited to immediate array
realization. These machines are relatively simple to [...]
re-customize. Furthermore, their relatively low cell-circuitry overhead and high
repair-efficiency give them major integration-level advantages.

Arm or tree machines may first be constructed in a system containing
many ICs. Such a system would enable further exploration and demonstration of
the feasibility of our approach. The major advantages of a many-IC system,
compared to a system integrated on one slice, are its low development cost and
high component accessibility. Such a system should be able to function when some
of its ICs are removed or destroyed, and some of its wires are cut. The many-IC
system would enable us to refine our designs and our test and repair programs.
The major limitation of a many-IC research vehicle is that it doesn’t accurately
model our ultimate goal, a cellular system integrated on one flawed slice.

Besides its obvious value as a system component, a single-slice tree or
arm machine would help answer important questions relevant to other arrays. How
accurately do our assumptions model actual conditions on a flawed slice? How
significant is the branch cell problem? How do power supply, heat dissipation,
array size, and other practical considerations affect array implementation?
Answers to these questions will depend partly on the engineering skill and
production [...] of array developers. Since arm and tree machines are simpler to
implement than many other programmable logic machines, ability to implement these
arrays is a sine qua non for practicality of many other proposed arrays.

Arrays like General are the most exciting, because of their use of
simple cells acting in parallel to provide universal computation-construction-repair 
capabilities. These arrays offer speed, reliability, and flexibility advantages in a 
low-cost integrated circuit. Current IC densities and yields probably don’t allow 
practical realization of these large arrays. However, densities and yields are 
improving so rapidly that these arrays should be feasible before 1980. By then, 
many questions pertinent to these arrays should have been solved for tree and 
arm arrays. Continued work on testing and repair, and development of computer- 
aided design facilities for these arrays, will be important to their commercial 
success. Consideration should be given to the machine organizations most suited 
to high-relcon machines. 

The first use of our approach to high-relcon machines may be in many-IC 
arrays of fairly complex machines, such as microprocessors. This is true 
because these arrays are closer to conventional digital systems. Unfortunately, 
the fact that such arrays have relatively low basic operation speed compared to 
General means they don't use high-relcon arrays to full advantage. Nevertheless, 
we've seen the advantages of building simple test and repair mechanisms into an 
iterated component. 

Since our test, configuration, and repair techniques may be adapted to 
existing arrays, it would be useful to categorize these arrays according to their 
realizability as an arm, tree, high-relcon, or other class of embedded machine. 


Other inquiries may take numerous directions. A more rigorous 
treatment of our testing assumptions and approach would be useful. Many 
questions remain concerning the repair of checkerboard arrays that embed 
high-relcon machines. These questions concern the best way to repair these 
arrays, and the limits of this repair. A better understanding of repair will 
allow better estimates of the reliability and maintainability of high-relcon 
machines. The use of a plurality of high-relcon machines in a self-repairing 
system should be explored. The reliability and maintainability levels that can 
be achieved by our various machines should be compared to the levels achieved 
by other machines. The network models and terminology we've developed for 
embedded machines can be refined, and new machine classes can be identified and 
studied. Our treatment of testing and repair for checkerboard arrays can be 
extended to arrays with other interconnection networks. 